Abstract

This report describes 12 studies dealing with the knowl- 
edge monitoring component of metacognition. It is 
assumed that knowledge monitoring is basic to other 
metacognitive activities, such as evaluating learning, se- 
lecting appropriate strategies, or planning, because 
distinguishing between what students know and do not 
know ought to be a prerequisite for these other higher 
level activities. The 12 studies, 10 in the verbal domain
and two in mathematics, used various versions of a 
knowledge monitoring assessment (KMA) which evaluates
the discrepancy between students’ estimates of their
knowledge in a domain and their demonstrated knowl- 
edge in that domain based on performance on a multiple 
choice test. The results provide a good deal of support for 
the construct validity of the KMA and suggest that it has 
considerable generalizability over different types of con- 
tent and varying student populations. Since the KMA may 
be group or computer administered and is objectively 
scored, it has substantial advantages over other means of 
evaluating metacognition. Suggestions for further research 
using the procedure are made. 

Introduction 

Metacognition has been defined as the ability to mon- 
itor, evaluate, and make plans for one’s learning (Flavell, 
1979; Brown, 1980). Research has shown that learners 
with effective metacognitive skills are more capable of 
making accurate estimates of what they know and do not 
know, of monitoring and evaluating their ongoing 
learning activities, and of developing plans and selecting 
strategies for learning new material. A large body of liter- 
ature has addressed differences in metacognitive abilities 
between learning disabled and regular students, as well as 
between generally capable learners and their less able 
counterparts (Schraw, in press). This research clearly indi- 
cates that metacognitive abilities are critically important 
for effective learning. 

Metacognitive processes are usually divided into three 
components (Pintrich, Wolters, & Baxter, in press): 
knowledge about metacognition, monitoring of metacog- 
nitive processes, and control of those processes. The 
research described in this report concentrates on the mon- 
itoring component of metacognition, specifically, 
students’ ability to monitor their learning by differenti- 
ating between the known and unknown. It is assumed 
that effective control of learning cannot occur in the ab- 
sence of accurate monitoring. If students cannot 
distinguish between what they know and do not know, 
they can hardly be expected to exercise control over their
learning activities or to select appropriate strategies to attain
their goals.

Our concern with assessing the ability to monitor 
knowledge is based on the reasoning that it is a crucial 
component in most learning and instructional contexts. In 
situations where students have to master a great deal of 
new knowledge, those who can accurately distinguish be- 
tween what they have already learned and what is yet to 
be acquired have an important advantage, because they 
can skip over material that has already been mastered, or 
merely briefly review it. Such students can then devote 
most of their time and energies to new, unfamiliar mate- 
rial. In contrast, those with less adequate knowledge 
monitoring processes are likely to allocate their time and 
resources less effectively and to spend valuable time 
studying what is known at the expense of unfamiliar ma- 
terial, and consequently, to have greater difficulty 
mastering new subjects. For these reasons, the program of 
research described in this report concentrated on the de- 
velopment of a procedure to assess students’ ability to 
monitor their knowledge and to differentiate between 
what they believe they know and do not know and what 
they actually know and do not know. 

The purposes of this report are both to describe the 
metacognitive knowledge monitoring assessment (KMA) 
we have developed and to report on a program of research
(12 studies in all) that relates scores on the KMA
procedure to reading comprehension, problem solving in
mathematics, and, more generally, to learning in classroom
settings. In addition, we report analyses relating scores on
the KMA to such variables as anxiety, interest, and need
for feedback, and examine the usefulness of the
procedure in distinguishing learning-disabled and atten- 
tion-deficit hyperactive students from those without 
special educational needs. All the studies described in this 
report used the KMA, a procedure that may be adminis- 
tered as either a paper-and-pencil or a computer-based 
assessment. Unlike other assessments of metacognitive 
processes, the KMA is objectively scored and does not rely 
on self-reports of cognitive processing. 

Assessing Metacognition 

Despite its importance in meaningful human learning, 
the assessment of metacognition has proven to be both difficult
and time-consuming (Pintrich et al., in press).
Metacognition, an executive process (Borkowski, Chan, &
Muthukrishna, in press), monitors and coordinates the cog- 
nitive processes employed during learning, so, as might be 
expected, there are considerable difficulties in assessing such 
higher-level processes. Metacognition is usually assessed in 
two principal ways: by observing students’ performance or 
via self-report inventories. Problems associated with each of 
these forms of assessment are discussed below. 
Observation and Verbal Reports

Assessing metacognition via observation or verbal re- 
ports usually requires all of the following: 

1) that students work on some task individually;

2) that their performance is carefully observed; and 

3) that it is recorded in some way (notes taken by ob- 
servers or audio/videotapes). 

Often additional steps are required before metacogni- 
tion can be effectively rated, including detailed interviews 
with students, the development of “think aloud” proto- 
cols collected as students work on a learning task, and the 
recording of students’ introspective reports. Multiple 
raters are usually required to inspect both the performance 
records and the interviews or introspection protocols be- 
fore an effective rating of metacognition can be made 
(Meichenbaum, Burland, Gruson, & Cameron, 1985). 
Referring to this approach, Royer, Cisero, and Carlo 
(1993) noted that, “The process of collecting, scoring, and 
analyzing protocol data is extremely labor intensive” (p. 
203). The resources for such efforts are rarely available in 
most instructional settings, and even in many university- 
based research programs. Pressley’s work (Pressley & 
Afflerbach, in press, 1995) provides a good example of the 
complexities of conducting protocol analysis and Baker 
and Cerro (in press) also discuss problems with this 
approach, especially when it involves assessing metacogni- 
tion through a process of error detection. 

Labor-intensive practices such as those described above 
make it difficult to evaluate metacognition in many in- 
structionally relevant settings, including secondary and 
postsecondary institutions, as well as training environ- 
ments in business and industry, government agencies, and 
the military. In view of these difficulties, it is not sur- 
prising that most metacognitive research is conducted in 
elementary and some secondary school settings where the 
time of those participating in the research can be diverted 
for the research effort. Of course, substantial resources 
still have to be devoted to enable researchers to collect 
such data. 

Self-Reports 

A number of self-report measures of metacognition 
(Everson, Hartman, Tobias, & Gourgey, 1991; Jacobs &
Paris, 1987; O’Neil, 1991; Pintrich, Smith, Garcia, & McK- 
eachie, 1991; Schraw & Dennison, 1994) are in widespread 
use. Such questionnaires have the advantage of being easily 
administered to groups and may be scored rapidly and ob- 
jectively. Self-report scales usually ask respondents to select 
from a set of printed choices the cognitive processes and 
strategies they use while learning. Such scales put a pre- 
mium on effective reading abilities and, therefore, are 
usually not suitable for use with young children. 


Unfortunately, the use of self-report measures in as- 
sessing a complex process such as metacognition raises a 
variety of questions, including the following: Are students 
aware of the cognitive processes they use during learning? 
Further, are students able to describe and report on the 
processes used, even by merely selecting from available 
multiple-choice alternatives? Finally, there is the question 
of whether students report honestly on the processes they 
use. While the truthfulness of students’ answers is always 
an issue with any self-report, it may be of particular im- 
portance with respect to reports of cognitive processes used 
during learning, because students at any level are probably 
reluctant to admit that they may be relatively casual in their 
attempts to complete school assignments. Of course, these 
concerns are minimized if appraisals of any construct, 
metacognition in particular, do not rely on self-reports. 

Rationale for Assessing 
Knowledge Monitoring Ability 

Each of the studies reviewed below employed a tech- 
nique for assessing knowledge monitoring ability that 
simultaneously evaluated students’ self-reports of their de- 
clarative word knowledge, or their procedural 
problem-solving ability in math, and their demonstrated 
knowledge or ability. The basic strategy is to assess 
knowledge monitoring processes by evaluating the dis- 
crepancy between students’ estimates and their actual 
(determined by performance on a test) knowledge or 
ability. The KMA was applied to the domain of students’ 
declarative word knowledge in 10 of the 12 studies de- 
scribed in this report. This domain was selected because of 
its relevance to learning in a classroom setting. To demon- 
strate that the procedure generalizes to other academic 
domains such as math or science, two studies dealt with 
students’ procedural knowledge in solving mathematical 
problems, another important domain in classroom 
learning at all levels. 

On the KMA, students are first asked to estimate their 
knowledge of words or their ability to solve math prob- 
lems. Actual word knowledge or problem-solving ability 
is subsequently assessed by administering an objectively 
scored test, most frequently in multiple-choice format. 
The discrepancies between students’ estimates and their 
actual knowledge are used as an index of the accuracy of 
students’ ability to monitor metacognitive knowledge. 

The KMA generates four scores that reflect the relationship
between students’ estimates of their knowledge
and their test performance. Two scores describe items that
students estimated they knew or problems they estimated
they could solve, and that they:

1) answered correctly on a test (abbreviated as + +), or

2) answered incorrectly (+ -).

Two further scores describe items that students estimated
they did not know or problems they estimated they were
unable to solve, and that they:

3) answered correctly (- +), or

4) answered incorrectly (- -).

Of course, the + + and - - scores are assumed to reflect
accurate knowledge monitoring estimates, and the + -
and - + scores inaccurate estimates.
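
The four-way classification described above can be illustrated with a short sketch. This is only an illustration under stated assumptions, not the scoring program used in the studies: the function name kma_scores, the Boolean input format, and the five-item example are all hypothetical.

```python
from collections import Counter


def kma_scores(estimates, correct):
    """Tally the four KMA outcome categories for one student.

    estimates: sequence of bool, True if the student claimed to know the item
    correct:   sequence of bool, True if the item was answered correctly
    (Both the name and the input format are illustrative assumptions.)
    """
    estimates, correct = list(estimates), list(correct)
    if len(estimates) != len(correct):
        raise ValueError("estimates and correct must have the same length")
    labels = {
        (True, True): "++",    # claimed known, answered correctly
        (True, False): "+-",   # claimed known, answered incorrectly
        (False, True): "-+",   # claimed unknown, answered correctly
        (False, False): "--",  # claimed unknown, answered incorrectly
    }
    counts = Counter(labels[(bool(e), bool(c))] for e, c in zip(estimates, correct))
    return {key: counts.get(key, 0) for key in ("++", "+-", "-+", "--")}


# Hypothetical five-item example (not data from the report).
example = kma_scores(
    estimates=[True, True, False, False, True],
    correct=[True, False, True, False, True],
)
accurate_estimates = example["++"] + example["--"]
inaccurate_estimates = example["+-"] + example["-+"]
```

In this made-up example, three of the five estimates are accurate (two + + and one - -) and two are inaccurate (one + - and one - +).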

As is true of other types of metacognitive measures, the
KMA estimates also consist, in part, of self-reports. How- 
ever, because of their reliance on working memory, the 
information in such reports typically is more readily avail- 
able to students than when evoked through questions 
appearing on self-report inventories that often require rec- 
ollections of the cognitive processes engaged in during 
learning, and/or how frequently the processes were used. 
More important, the KMA also incorporates students' ac- 
tual performance on a test. Since estimated and actual 
performance can both be scored objectively, the procedure 
has a clear-cut advantage over asking students to report on 
their cognitive processes either in the form of protocols or 
by choosing alternatives on self-report inventories. 

Classroom assessments are often used to determine 
whether students have learned the material presented in 
class. However, it is also important to evaluate students’ 
ability to improve their knowledge and make accurate 
metacognitive estimates of whether the new material has 
been learned, in addition to assessing whether they have 
retained prior learning. Consequently, several of the 
studies reported below also examined students’ accuracy 
in assessing whether they had mastered the instructional 
materials presented in the research. 

Finally, the research described below examined the re- 
lationship of KMA scores and measures of reading 
comprehension, classroom learning, anxiety, interest, and 
need for feedback, as well as whether the KMA distin- 
guished between students diagnosed as being either 
learning disabled or having an attention-deficit disorder 
and those not having special educational needs. 

Reports of the studies are organized according to the 
variables examined. Because a number of the investiga- 
tions dealt with multiple variables, some studies appear 
under more than one heading. In such instances, a detailed 
report of the study is given when it is first described, and 
the reader is directed to that description in subsequent, 
briefer references. 

Knowledge Monitoring Ability 
and Reading Comprehension 

A good deal of research has demonstrated that word 
knowledge or vocabulary is one of the major components
of reading comprehension and learning generally (Breland,
Jones, & Jenkins, 1994; Just & Carpenter, 1987).
However, few investigations have examined whether the 
accuracy of students’ estimates of their word knowledge is 
an important predictor of ability to learn. If students are 
unable to accurately differentiate between the words they 
know and do not know, they will find it difficult to deter- 
mine whether to slow down while reading and try to 
figure out the meaning of a word from the context, or go 
to a dictionary to define it, or go on in uncertainty or in 
the possibly mistaken belief that they understand the
word’s meaning. Such uncertainty will reduce reading 
comprehension among students with inadequate knowledge
monitoring ability. On the other hand, being able to
accurately distinguish between words they can define cor- 
rectly and those they cannot should enhance students’ 
reading comprehension and their ability to learn new ma- 
terial. Because a great deal of research on metacognition 
has dealt with reading comprehension, the first two 
studies used the KMA's relationship to measures of 
reading comprehension as a criterion for assessing the va- 
lidity of the KMA. 

Study I. Estimates of Word Knowledge and 
Reading Comprehension 1 

In view of the demonstrated relationships between 
metacognition and reading comprehension, it seemed im- 
portant to evaluate the accuracy of students’ monitoring 
of their word knowledge within the context of reading. 
This was expected to increase the relevance of the KMA 
to classroom learning. It was also anticipated that the 
ability to learn new vocabulary would be an important 
skill for reading specifically, and classroom learning gen- 
erally. Furthermore, students’ ability to make accurate 
metacognitive assessments of whether they have actually 
learned the meanings of new words also seems to be an 
important indicator of reading comprehension. Therefore, 
students’ ability to improve their knowledge and to make 
metacognitive estimates of that improved knowledge were 
also assessed in this study. 

Participants and Procedures 

A total of 169 freshmen at a large urban university par- 
ticipated in this study. The students, considered to be at 
risk of doing poorly in college, attended a summer session 
program designed to familiarize them with the university 
and the skills needed to succeed in their studies. Partici- 
pants were randomly assigned to one of two groups. The 
first group of 82 students was asked to read a 750-word
passage and then complete a word list and vocabulary test 
composed of words that had been defined explicitly or im- 
plicitly in the text. The passage described the incidence 
and prevalence of heart disease, the risk factors for devel- 
oping heart ailments, the technical terms for varying 
degrees of the illness, the characteristics differentiating the
varying degrees, and a number of ways by which the risks 
of developing heart disease could be reduced. Earlier re- 
search (Tobias, 1989; 1969) indicated that there was a 
good deal of variability in participants’ prior knowledge 
of this material. The second group was administered the Sentence
Verification Test (SVT) (Royer, Lynch, Hambleton,
& Bulgarelli, 1984) rather than reading the passage. 

All participants were asked to indicate, by checking off 
one of two blanks on the word list, whether they knew or 
did not know each of 33 words. All of the words were de- 
fined, either explicitly or implicitly, in the passage on heart 
disease previously administered to the first group. When 
the word list was completed, students took a four-choice 
vocabulary test containing all 33 words on the list with in- 
structions to select the correct synonyms or definitions of 
the words. To determine the participants’ level of reading 
comprehension, the Descriptive Test of Language Skills, 
Reading, and Comprehension (DTLS) (College Board, 
1979) was administered. 

The reading passage, word list, and vocabulary test 
were examined by four raters to ascertain whether the 
words appearing in the word list were defined implicitly 
or explicitly in the text. The passage was revised until con- 
sensus was reached among the judges. Of the 33 words, 
the ratings indicated that 25 were defined implicitly (e.g., 
“Epidemiologists who have compared the prevalence of
heart disease in the United States and in other countries...”)
and eight words were defined explicitly (e.g.,
“Coronary or heart disease...”).

Results and Discussion 

The accuracy of students’ estimates of their metacognitive 
word knowledge was assessed by comparing those esti- 
mates with students’ subsequent performance on the 
vocabulary test. The four scores described earlier were 
generated. Terms on the word list checked as known were
scored:

1) correct (+ +), or

2) wrong (+ -) on the vocabulary test.

Two further scores described terms students
checked as unknown on the word list and answered

3) correctly (- +), or

4) incorrectly (- -) on the vocabulary test.

These four KMA scores were computed for the total set 
of words and separately for those words defined explicitly 
and implicitly in the passage. To examine the relation be- 
tween KMA scores and reading comprehension, the 
correlations between these scores and the DTLS subtest 
scores were computed and are shown in Table 1 for the 
entire sample, as well as separately for the group taking 
the Sentence Verification Test (SVT) and the group 
reading the heart disease passage. 


The correlations in Table 1 indicate that, as expected,
accurate metacognitive estimates of the number of
words students thought they knew and answered correctly
on the test (T+ +) had a substantial positive
relationship with reading comprehension. Estimates of the
number of words thought to be unknown and answered
incorrectly (T- -) were negatively related to reading
comprehension. Furthermore, as also anticipated, accurate
estimates of words defined explicitly (E+ + and
E- -) and implicitly (I+ + and I- -) were also significantly
correlated with reading comprehension, whereas
incorrect estimates (E- +, E+ -, I- +, and I+ -) were
not. The magnitude of many of the correlation coefficients
is especially impressive because the participants were relatively
homogeneous with respect to ability: they were
considered to be at risk of doing poorly in college and,
therefore, advised to participate in the orientation and
pre-freshman skills program offered by the university.

Table 1

Zero-Order Correlations for Selected Variables with
DTLS Reading Comprehension Scores

Score    Entire Sample    SVT Group    Reading Passage Group

T+ +        .4655**        .2913*           .6474**
T- -       -.4330**       -.3721**         -.5442**
T- +       -.1803         -.0885           -.2600
T+ -        .0678          .2027           -.0825
E+ +        .3263**        .0808            .5221**
I+ +        .4662**        .3185*           .6302**
E- -       -.3349**       -.2894*          -.4196**
I- -       -.4413**       -.3822**         -.5438**
E- +       -.1390         -.1715           -.1151
I- +       -.1626         -.0523           -.2827*
E+ -        .1586          .3295*           .0389
I+ -        .0140          .1095           -.0877

T = Total score on word list task.
E = Words defined explicitly.
I = Words defined implicitly.
+ + = Words students claimed to know and got right on vocabulary test.
- - = Words students claimed not to know and got wrong on vocabulary test.
+ - = Words students claimed to know but got wrong on vocabulary test.
- + = Words students claimed not to know but got right on vocabulary test.
* p < .01
** p < .001

The relationships between KMA scores and reading 
comprehension were dramatically lower for the second 
group of students who did not read the passage and in- 
stead took the SVT. Those reading the passage had the 
opportunity to learn the meanings of previously unknown
words and improve their knowledge of familiar or par- 
tially known words, whereas the students who were given 
the SVT did not. It was expected that students who could 
improve their word knowledge would make more accu- 
rate metacognitive estimates than would the others. 
Operationally then, it was expected that group member- 
ship (i.e., reading the heart disease passage or taking the 
SVT) and accuracy of metacognitive estimates would have 
an interactive effect on reading comprehension. This hy- 
pothesis was tested using multiple regression analysis, in 
which a binary vector for group, the KMA score, and their 
product (representing the interaction term) were entered 
as independent variables and the reading comprehension 
test score was the dependent variable. The results of that 
analysis are shown in Table 2. 
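
The structure of such a regression model (a binary group vector, a KMA score, and their product as the interaction term) can be sketched as follows. This is a minimal illustration, not the authors' analysis: the function names are hypothetical, the data are synthetic and noise-free, and the sketch returns unstandardized least-squares coefficients rather than standardized beta weights or t tests.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_interaction_model(group, score, outcome):
    """Least-squares fit of outcome ~ 1 + group + score + group * score."""
    rows = [[1.0, g, s, g * s] for g, s in zip(group, score)]
    k = 4
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, outcome)) for i in range(k)]
    names = ("intercept", "group", "score", "group_x_score")
    return dict(zip(names, solve_linear_system(xtx, xty)))


# Synthetic, noise-free data (NOT from the report): group 1 stands for the
# passage readers, group 0 for the SVT group; score is a KMA score.
group = [0, 0, 0, 0, 1, 1, 1, 1]
score = [5, 10, 15, 20, 5, 10, 15, 20]
outcome = [20 + 2 * g + 0.5 * s + 1.0 * g * s for g, s in zip(group, score)]
coefficients = fit_interaction_model(group, score, outcome)
```

A nonzero group_x_score coefficient means that the slope relating the KMA score to the outcome differs between the two groups, which is what an interactive effect amounts to in this kind of model.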

As expected, the interaction term was significant in five 
of the six equations shown in Table 2. These results indi- 
cated that students who could improve their word 
knowledge by reading the passage made significantly 
more accurate metacognitive estimates than did those 
who did not have that opportunity. This finding is not sur- 
prising since the major skill assessed for the group reading 
the passage was probably the ability to infer the meaning 
of words, surely an important component of reading com- 
prehension. Clearly then, the opportunity to improve 
word knowledge and then estimate mastery of the im- 
proved knowledge increased reading comprehension. 

Table 2

Beta Weights and Associated t Test Results for All Effects
on All Derived Scores

             Group             Score          Group x Score
Score     Beta      t       Beta      t       Beta      t

T+ +      -.10     .44      -.62    3.33**     .88    2.67**
T- -       .12     .97       .04     .19      -.54    2.32*
E+ +      -.42    3.00**    -.34    1.22       .85    2.60*
E- -       .07     .56       .03     .14      -.41    1.70
I+ +      -.59    3.14**    -.07     .29       .82    2.55*
I- -       .10     .84       .00     .02      -.50    2.17*

T = Results for total word list.
E = Results for words defined explicitly.
I = Results for words defined implicitly.
+ + = Words students claimed to know and got right on vocabulary test.
+ - = Words students claimed to know but got wrong on vocabulary test.
- - = Words students claimed not to know and got wrong on vocabulary test.
- + = Words students claimed not to know but got right on vocabulary test.
* p < .05
** p < .01


Estimates and Actual Number Correct 

On the KMA, the raw score (the number of items answered
correctly) is obtained by adding the + + and - + scores.
The KMA scores described above
were a function of two factors: knowledge as reflected in
the number of items students answered correctly on the 
vocabulary test, and estimates of knowledge as reflected 
by how accurately students estimated the number they 
would get right. One question that arises is whether students’
estimates of their knowledge contribute above and
beyond their actual word knowledge as reflected by the 
raw vocabulary score. Of course, a great deal of research 
has demonstrated that students’ scores on vocabulary tests 
are highly related to reading comprehension and class- 
room learning generally (Breland, Jones, & Jenkins, 1994; 
Just & Carpenter, 1987). For the KMA to be useful, it 
should account for more variance than the number of vo- 
cabulary items students answered correctly, irrespective of 
their knowledge estimates. This question was examined in 
the first study, and in the subsequent investigations de- 
scribed later in this report. 
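
The distinction between the two factors can be made concrete with a small sketch. It is an illustration only: the proportion of accurate estimates computed below is a simple index assumed here for demonstration, not a score defined in this report, and both students' counts are invented.

```python
def raw_vocabulary_score(scores):
    """Number of items answered correctly, whatever the student estimated."""
    return scores["++"] + scores["-+"]


def monitoring_accuracy(scores):
    """Proportion of estimates that matched test performance.

    Illustrative index only (an assumption of this sketch).
    """
    total = sum(scores.values())
    return (scores["++"] + scores["--"]) / total


# Two hypothetical students on a 33-item word list (invented counts).
student_a = {"++": 20, "+-": 2, "-+": 2, "--": 9}
student_b = {"++": 12, "+-": 10, "-+": 10, "--": 1}

same_raw_score = raw_vocabulary_score(student_a) == raw_vocabulary_score(student_b)
accuracy_a = monitoring_accuracy(student_a)  # 29 of 33 estimates accurate
accuracy_b = monitoring_accuracy(student_b)  # 13 of 33 estimates accurate
```

Both hypothetical students answer 22 items correctly, yet their estimates differ sharply in accuracy, which is why the raw score alone cannot stand in for knowledge monitoring.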

The correlation between the raw score on the vocab- 
ulary test (total number of words correct) and the DTLS 
was .45. As Table 1 indicates, the largest correlation 
among the metacognitive estimates and reading ability, 
r = .65, was between the total number of words estimated
to be known and actually known (T+ + ). The difference 
in the magnitude of these correlations suggests that ac- 
curate estimates of word knowledge contributed 
variance above and beyond the total vocabulary score. 
When T++ was forced into a regression equation, the 
total number of words correct, irrespective of prior esti- 
mates, did not contribute enough independent variance 
to enter the equation, indicating that the accuracy of students’
estimates of their improved vocabulary was more
highly related to reading comprehension than their vo- 
cabulary test scores alone. The results of this first study 
were encouraging with respect to the construct validity 
of the KMA. 

Study II. Declarative Word Reading Ability, 
Knowledge Monitoring and Reading 
Comprehension 2 

The preceding study found strong relationships be- 
tween metacognitive monitoring ability and reading 
comprehension in general. The purpose of the second 
study was to determine the KMA’s relationship both to 
prior reading ability and to some of the components of 
reading comprehension, such as identifying words in con- 
text, understanding meaning, and understanding the 
writer’s tone and assumptions. We reasoned further that 
knowledge monitoring may be more readily measured 
through the use of signal detection methods (Green &
Swets, 1966; Macmillan & Creelman, 1991), which separate
feelings of knowing and other long-term memory-related
phenomena into signal and noise components.
Therefore, a further purpose of this study was to examine 
whether the signal detection paradigm could define more 
useful measures than the KMA scores used in Study I. Fi- 
nally, relationships between KMA scores and measures of 
test anxiety were examined in this study, but discussion of 
those particular findings will occur later in the report. 

Participants and Procedures 

The word list and vocabulary test used in the first study 
were administered to students, together with the worry 
subscale of the Test Anxiety Inventory (Spielberger, Gon- 
zalez, Taylor, Anton, Algaze, Ross, & Westberry, 1980) 
and the Descriptive Test of Language Skills, Reading, and 
Comprehension (DTLS) (College Entrance Examination 
Board, 1979), which contained three subscales: identi- 
fying words in context, understanding meaning, and 
understanding the writer’s tone and assumptions. An 
index of reading ability based on an earlier administration 
of the DTLS was obtained from the participants’ school 
records. The participants were 117 undergraduates at a 
large urban university, 65 percent of whom were women. 

Results and Discussion 

Knowledge monitoring ability was assessed by com- 
puting “hits” (the number of words students claimed to 
know and subsequently identified correctly on the vocab- 
ulary test or conversely said they did not know and failed 
to identify correctly on the vocabulary test) and “false 
alarms” (the number of words students claimed to know 
but did not correctly identify and those they claimed not 
to know yet correctly identified). Using signal detection 
theory, these two indices were transformed into a d’ index 
that provides an estimate of sensitivity to metacognitive 
monitoring and β, an index that provides an estimate of
the participants’ response bias. These two indices had a re- 
liability estimate of .78 (Cronbach, 1951). 
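
The exact transformation used to obtain these indices is not given in this report, so the following is only a sketch based on the conventional signal detection formulas, d’ = z(H) - z(F) and β = exp((z(F)² - z(H)²) / 2). Two assumptions are made and should be read as such: hit and false-alarm rates are defined in the conventional way ("known" claims on items answered correctly versus "known" claims on items answered incorrectly), which may differ from the tallies described above, and a log-linear correction is applied so that rates of 0 or 1 do not produce infinite z scores. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def sdt_indices(pp, pm, mp, mm):
    """Return (d_prime, beta) from the four KMA counts.

    pp, pm, mp, mm are the + +, + -, - +, and - - counts.
    Assumed (conventional) definitions, not confirmed by the report:
      hit rate         = P(claimed known | answered correctly)
      false-alarm rate = P(claimed known | answered incorrectly)
    """
    # Log-linear correction: add 0.5 to each count and 1 to each total.
    hit_rate = (pp + 0.5) / (pp + mp + 1.0)
    false_alarm_rate = (pm + 0.5) / (pm + mm + 1.0)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    z_hit, z_fa = z(hit_rate), z(false_alarm_rate)
    d_prime = z_hit - z_fa                          # sensitivity
    beta = math.exp((z_fa ** 2 - z_hit ** 2) / 2)   # response bias
    return d_prime, beta


# Hypothetical counts for a 32-item list (not data from the report).
d_prime, beta = sdt_indices(pp=12, pm=4, mp=4, mm=12)
```

Under these definitions, a d’ near zero indicates that estimates do not discriminate known from unknown items, and a β of 1 indicates no bias toward claiming that items are known or unknown.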

In general, more capable readers demonstrated higher 
levels of metacognitive monitoring ability. The correlations
of knowledge monitoring ability (as measured by
the d’ index) with prior reading ability and the experimental
measure of reading comprehension were .35 and
.39, respectively. Moreover, hierarchical multiple regres- 
sion analyses permitted isolation of the effects of 
metacognitive monitoring ability on reading test perfor- 
mance, once prior reading ability and anxiety were 
controlled statistically. These analyses suggested that 
metacognitive monitoring ability was positively related to 
reading test performance (B = .17, t = -2.23, p = .03). Similarly,
the correlations with the subscales on the reading
test measuring vocabulary in context, literal interpretation 
of text, and understanding the writers’ tone and assump- 
tions were .32, .43, and .26, respectively. 


Contrasts with Study I 

In Study II, the reading passage in which all of the vo- 
cabulary words were defined was not administered. The 
correlation of .35 for the d’ score was similar to the cor- 
relation of .29 (see Table 1) found in the first study 
between T+ + and reading comprehension for those stu- 
dents who did not read the passage. Of course that 
relationship is much weaker than the correlation of .65 
found in Study I between the same variables for students 
reading the passage. Clearly, then, these two studies sug- 
gest that the metacognitive word knowledge scores 
derived from the KMA had a strong, consistent relation- 
ship with standardized measures of reading 
comprehension and, further, that the opportunity to im- 
prove word knowledge and reestimate mastery of the 
improved knowledge increased the relationships with 
reading comprehension. 

Knowledge Monitoring Ability 
and Classroom Learning 

The first two studies were encouraging with respect to 
the relationship of word knowledge monitoring to reading 
comprehension. The results of these investigations indi- 
cated that students’ metacognitive estimates of their word 
knowledge were closely related to competence in the do- 
main in which the estimates were obtained, i.e., reading. 
One purpose of the studies to be described below was to 
examine whether the KMA score was related to more gen- 
eral academic domains such as classroom learning. The 
expectation of a relationship with classroom learning 
seemed reasonable because being able to accurately esti- 
mate one’s knowledge should make it easier to acquire the 
large amounts of new information taught in such settings. 

Four studies attempting to assess the relationship of 
KMA scores to classroom learning will be described 
below. Furthermore, because the vocabulary test and 
reading passage used in this set of studies dealt largely 
with familiar material using a minimum of technical vo- 
cabulary, the task of inferring the meanings of unknown 
words from the passage or estimating word knowledge 
seemed similar to learning in courses that rely largely on 
conventional vocabulary, rather than those that introduce 
a large set of new technical terms. Therefore, it seemed 
likely that declarative word KMA scores should be more 
closely related to students’ learning in English and hu- 
manities courses than in science and social science courses. 

Another purpose of the succeeding studies was to ex- 
tend the research on metacognitive knowledge monitoring 
ability to learning in secondary and postsecondary institu- 
tions. As mentioned above, much of the research dealing 
with metacognition has been conducted in elementary 
schools and to a lesser degree in secondary or postsecondary
settings. Two of the succeeding studies examined
the relationship of knowledge monitoring ability to stu- 
dents’ overall achievement in college, and to learning in 
different content domains, while two others examined 
these parameters among high school students and those 
who had dropped out of school. 

Study III. Knowledge Monitoring Ability and 
Achievement in College 3 

Students acquire a great deal of new knowledge during 
their secondary and postsecondary educational experi- 
ences. Therefore, their ability to estimate whether they have 
mastered either previously learned content or new material 
seemed to be an important characteristic of effective 
learners, especially in college. Accurate monitoring should 
enable students with effective knowledge monitoring 
strategies to concentrate on new materials and skim over 
familiar content. On the other hand, students with less ef- 
fective knowledge monitoring ability may waste time 
practicing or reviewing what they already know, rather 
than zeroing in on new material or updating partially 
learned content. Therefore, Studies III and IV asked stu- 
dents to estimate their vocabulary knowledge twice: the 
first time to assess their prior learning and the second to de- 
termine their ability to improve prior learning. On the basis 
of the first study, presented earlier, we presumed that im- 
proved estimates of word knowledge would be more 
closely related to learning in college, as reflected in their 
grade point averages (GPA), than would estimates derived 
from prior learning. 

The word list, vocabulary test, and reading passage 
used in the two studies reported above contained more ex- 
plicitly defined words than implicitly defined words. 
However, it was assumed that implicit definitions might 
be especially important at the college level, where students 
frequently have to infer the meanings of new words from 
context. Therefore, all the materials were modified to in- 
crease the number of implicitly defined words. 

Participants and Procedures 

The sample consisted of 139 students attending a large 
urban university, though only 84 subjects completed all 
the materials during two sessions. Part of the sample con- 
sisted of students entering the nursing program (N = 47, 
N = 33 with complete data) who were taking an orientation
course in nursing. The rest of the sample consisted of
freshmen (N = 92, N = 51 with complete data) taking a 
freshman orientation course. 

The word list, vocabulary test, and reading passage 
were revised to contain an equal number of target words 
that were defined explicitly and implicitly in the passage. 
The expository passage used in one of the previous studies 
was revised and a narrative version of the same passage
was developed to examine the effect of situational interest
on metacognitive knowledge monitoring ability. (Findings 
dealing with interest will be discussed later in this report.) 

The word list and vocabulary test were also revised and 
contained 38 words, half explicitly defined and half implic- 
itly. Types of definitions were determined by two 
independent judges who rated all words. Disagreements 
were resolved by revising the passage until agreement was 
reached. Because these materials will be used in six of the 
studies described later in this report, a sample, consisting of 
the first page of the materials, is shown in Figures 1 to 3. 

The word list and vocabulary test (alpha reliability = 
.80) were administered in the first session. Students were 
then randomly assigned to read one of two versions of the 
text in the second session, followed by a readministration 
of the word list and vocabulary test. This took place 
during students’ classes in the presence of their instructors. 

Results and Discussion 

The correlation between total score on both adminis- 
trations of the vocabulary test, based on the 84 students 
who completed both administrations of the test, was .75. 
This is not a test-retest reliability coefficient because stu- 
dents read the passage from which the meaning of the 
words could be inferred immediately before the second 
administration of the vocabulary test. 

Students’ estimated word knowledge and performance 
on the vocabulary test were determined for both 
administrations. Two scores were computed for each administration:
the total number of correct estimates (words in the ++ and −−
categories) and of incorrect estimates (words in the +− and −+
categories). Preliminary analysis revealed no
differences between students assigned to the expository or 
narrative text versions, or between explicitly and implic- 
itly defined words; therefore the data for both versions of 
the text and both types of words were pooled. The corre- 
lations between the correct and incorrect estimates for 
both administrations and students’ GPAs in English, hu- 
manities, sciences, social sciences, and combined GPA 
were computed and are shown in Table 3. The overall 
GPA for the participants who were freshmen in their first 
term of college was based on an average of 12.1 credits 
(SD = 5.6), whereas the nursing students had a mean of
56.4 credits (SD = 28.3). Therefore, the correlations are 
presented for each group separately, as well as for the 
total sample. Table 3 also shows the correlations between 
metacognitive knowledge estimates and raw score, and 
the number correct on the vocabulary test. 
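The scoring just described can be illustrated with a minimal sketch. It assumes each word yields one "know" or "do not know" estimate and one pass or fail on the vocabulary test; the function names and the six-word example are hypothetical and are not taken from the study materials.

```python
from collections import Counter


def kma_categories(estimates, passed):
    """Cross each estimate with the test outcome for the same word.

    estimates: list of bool, True if the student checked "Know".
    passed:    list of bool, True if the vocabulary item was answered correctly.
    The first sign of each key is the estimate, the second the test result.
    """
    if len(estimates) != len(passed):
        raise ValueError("estimates and passed must have the same length")
    counts = Counter({"++": 0, "+-": 0, "-+": 0, "--": 0})
    for est, ok in zip(estimates, passed):
        counts[("+" if est else "-") + ("+" if ok else "-")] += 1
    return counts


def kma_scores(estimates, passed):
    """Summarize the four categories into the scores used in the analyses."""
    c = kma_categories(estimates, passed)
    return {
        "correct_estimates": c["++"] + c["--"],    # estimate agrees with result
        "incorrect_estimates": c["+-"] + c["-+"],  # estimate contradicts result
        "raw_score": c["++"] + c["-+"],            # number correct on the test
    }


# Hypothetical student responding to six words only.
estimates = [True, True, False, False, True, False]
passed = [True, False, False, True, True, False]
scores = kma_scores(estimates, passed)
print(scores)
```

The raw score is computed as the sum of the ++ and −+ categories, which matches the definition of number correct given later in the report.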

The correlations shown in Table 3 are generally posi- 
tive and frequently significant, even though they ranged in 
magnitude from low to moderate. The results support the 
concurrent validity of the KMA procedure with respect to 
its relationship to learning in college. As expected, corre- 
lations between knowledge monitoring scores and GPA in 
Figure 1. Sample of word list for knowledge monitoring procedure.


Please indicate whether you know, or do not know each of the words listed below, by checking the appropriate space. 


Abuse            ___ Know    ___ Do not know
Acute            ___ Know    ___ Do not know
Ascribed         ___ Know    ___ Do not know
Attenuate        ___ Know    ___ Do not know
Attributed       ___ Know    ___ Do not know
Benign           ___ Know    ___ Do not know
Cholesterol      ___ Know    ___ Do not know
Coronary         ___ Know    ___ Do not know
Deterrent        ___ Know    ___ Do not know
Diagnosis        ___ Know    ___ Do not know
Efficacy         ___ Know    ___ Do not know
Emanating        ___ Know    ___ Do not know
Entity           ___ Know    ___ Do not know
Epidemiology     ___ Know    ___ Do not know
Esoteric         ___ Know    ___ Do not know
Etiology         ___ Know    ___ Do not know
Fatalities       ___ Know    ___ Do not know
Genre            ___ Know    ___ Do not know
Gravity          ___ Know    ___ Do not know
Guarded          ___ Know    ___ Do not know
Implicated       ___ Know    ___ Do not know
Figure 2. Sample of vocabulary items for the knowledge
monitoring assessment procedure.

For each word check the space which means most nearly the same thing as the first word.

 1) Prevalent
      ___ a) stronger      ___ b) winning      ___ c) frequent        ___ d) prior

 2) Attributed
      ___ a) caused        ___ b) ovation      ___ c) stream          ___ d) tax

 3) Optimal
      ___ a) best          ___ b) opening      ___ c) eyeball         ___ d) cheerful

 4) Obesity
      ___ a) listen        ___ b) fat          ___ c) apology         ___ d) obsolete

 5) Acute
      ___ a) pretty        ___ b) serious      ___ c) heavy           ___ d) often

 6) Ascribe
      ___ a) refer         ___ b) written      ___ c) question        ___ d) bed

 7) Transitory
      ___ a) [illegible]   ___ b) temporary    ___ c) carry           ___ d) train

 8) Median
      ___ a) stripe        ___ b) divider      ___ c) middlemost      ___ d) negotiate

 9) Ingest
      ___ a) joke          ___ b) eat          ___ c) enter           ___ d) exit

10) Residual
      ___ a) lasting       ___ b) live         ___ c) income          ___ d) clever

11) Infarction
      ___ a) tooth decay   ___ b) particle     ___ c) rule violation  ___ d) muscle death

12) Fatalities
      ___ a) fatty tissue  ___ b) deaths       ___ c) fateful         ___ d) take in stride

13) Incidence
      ___ a) new cases     ___ b) an example   ___ c) exciting        ___ d) event

14) Attenuate
      ___ a) listen        ___ b) reduce       ___ c) pay attention   ___ d) try

15) Guarded
      ___ a) uncertain     ___ b) optimistic   ___ c) degrees         ___ d) watchful

Please turn to next page to continue
Figure 3. Sample of expository reading passage for knowledge
monitoring assessment procedure. 

Read this passage carefully: 


Coronary or heart disease is a major health problem among all ethnic, racial and occupa- 
tional groups in the United States. In addition to coronary disease, health workers are worried 
about many other maladies affecting Americans, such as cancer, AIDS, and other equally serious 
conditions. However, compared to all other serious illnesses, coronary problems cause more than 
half of the total number of fatalities or deaths in the United States. To be exact, 55% of the deaths 
among all groups in this country or more fatalities than for all the other illnesses combined, may 
be ascribed to coronary disease. Not only is coronary disease responsible for the greatest number 
of fatalities in this country, but it is also the most prevalent, or frequent, of all the serious illnesses. 
That is, coronary disease is more prevalent than all the other serious conditions combined. 

The incidence, that is the number of new cases, of coronary disorders, is higher among men 
than among women for the country as a whole. The incidence of heart disorders is also higher for 
cigarette smokers than it is among non-smokers. A higher incidence of coronary disease among 
Americans is also attributed to alcoholism, drug addiction, and tobacco. The etiology, or causes, 
of coronary disease among Americans are not completely clear, but excessive use or abuse of al- 
cohol and the other substances mentioned above is often linked to coronary disease. In addition, 
tension, air pollution, weighing too much, and engaging in too little exercise are also implicated as 
causes of heart disease among people living in the United States. 

The gravity of heart disease for people in general is a function of the magnitude of coro- 
nary damage. The heart is basically a muscle similar to all the others in the human body. The 
amount of damage to the heart muscle, or myocardium, determines the seriousness of the illness. 
The most serious type of damage, which is called myocardial infarction, occurs when the heart 
muscle dies. One major difference between the myocardium and other muscles in the human body 




Table 3

Correlations Between Knowledge Monitoring Scores, Raw Scores,
and Overall Grade-Point Average in Different Subject Areas

                          Administration 1              Administration 2
                             Correct    Raw                Correct    Raw
                             Estimate   Score              Estimate   Score
Group                   N    r          r             N    r          r

Total GPA
  Total               101     .20*       .01          94     .09      -.00
  Freshmen             65     .09       -.25          61    -.10      -.21
  Nurses               36     .28*      -.37*         33     .19       .17

English GPA
  Total                72     .30**      .19          63     .19       .05
  Freshmen             53     .31**      .10          48     .00       .16
  Nurses               19     .25        .33          19     .45*      .44*

Humanities GPA
  Total                82     .26**      .04          74     .13       .00
  Freshmen             52     .12       -.21          46    -.11       .22
  Nurses               30     .48**      .40*         28     .35*      .24

Science GPA
  Total                65     .18       -.01          63     .03      -.07
  Freshmen             28     .11       -.30          27    -.28      -.47
  Nurses               37     .26       -.42*         36     .18       .26

Social Science GPA
  Total                64     .18        .26          63     .24*     -.26*
  Freshmen             26     .15        .10          29     .14       .18
  Nurses               38     .09        .31*         34     .14       .10

* p < .05
** p < .01

English were generally highest; presumably the ability to 
accurately estimate word knowledge is more important in 
English than in other subjects. Relationships with human- 
ities GPAs and with the combined GPA were generally 
significant but lower than those with English GPAs; cor- 
relations with social science and science GPAs were 
generally lower and usually not significant. The largely 
nonsignificant relationships with social and behavioral 
science GPAs were surprising, because it had been as- 
sumed, perhaps naively, that these courses would present 
fewer technical terms and unfamiliar vocabulary than the 
natural science courses. Perhaps grades in these courses, as 
with those in science, reflect greater domain-specific 
knowledge than is true in English and humanities courses. 

The significance of the correlations reported in Table 3 
varies widely, probably as a function of at least three fac- 
tors. First, there is a different number of cases in each cell 
because some students were not present for both adminis- 
trations of the materials, leading to variability in the 
predictors. Second, college grades are often unreliable 
(Werts, Linn, & Joreskog, 1978; Willingham, Lewis,
Morgan, & Ramist, 1990), reducing the magnitude of any
correlations with them. Third, students completed varying 
numbers of courses in each area; thus, GPAs may have
been based on one or a few courses in some fields, re- 
ducing the stability of the criterion. The reliability of the 
grades may have been reduced further by three factors: 

1) students took dissimilar courses in each of the broad 
subject areas shown in Table 3; 

2) when similar courses were taken, they were taught 
by different instructors; and 

3) there were differences in students’ major fields of study. 

As expected, the correlations between knowledge mon- 
itoring scores and grades in English were generally higher 
and more frequently significant than was true of any other 
subject. For the 84 students for whom there were com- 
plete data for both administrations of the vocabulary test, 
the mean total score increased from 23.3 (SD = 6.0) for 
the first vocabulary test to 26.0 (SD = 6.6) for the second 
[t(83) = 5.53, p < .001]. Thus students clearly learned the
meanings of some words after having the chance to im- 
prove their word knowledge by reading the passage. 
However, in contrast to the results of the first study, the
relationships between the metacognitive scores and grades 
shown in Table 3 were generally higher before students 
read the passage rather than afterward. The Study I find- 
ings of higher relationships with DTLS scores on the 
second administration of the KMA may be attributable to 
the use of reading comprehension scores rather than 
grades as criteria. Apparently inferring the meaning of 
words is a more important component of reading com- 
prehension than of classroom learning more generally. 
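As a rough consistency check on the vocabulary gain reported above, a paired t statistic can be approximated from the summary values alone: the two means and standard deviations, the correlation of .75 between administrations, and N = 84. The sketch below assumes the standard formula for the variance of a difference between correlated scores; the function name is illustrative, and the result is only an approximation because the inputs are rounded.

```python
import math


def paired_t_from_summary(mean1, sd1, mean2, sd2, r, n):
    """Approximate a paired t statistic from summary statistics.

    The variance of the difference scores is sd1^2 + sd2^2 - 2*r*sd1*sd2,
    where r is the correlation between the two administrations.
    Returns (t, degrees of freedom).
    """
    var_diff = sd1 ** 2 + sd2 ** 2 - 2 * r * sd1 * sd2
    standard_error = math.sqrt(var_diff / n)
    return (mean2 - mean1) / standard_error, n - 1


# Values reported for the 84 students with complete data.
t_value, df = paired_t_from_summary(23.3, 6.0, 26.0, 6.6, 0.75, 84)
print(f"t({df}) = {t_value:.2f}")
```

Under these assumptions the statistic comes out close to the reported value of 5.53, and the mean gain of 2.7 words agrees with the later statement that fewer than three new words were learned.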

As mentioned, although it was assumed that having the 
chance to improve word knowledge before estimating it 
would be more similar to the way in which students learn 
in their courses than would the process of merely esti- 
mating prior word knowledge, the relationships with 
grades were not higher for the second administration of 
the KMA than for the first. While the increase in vocabu- 
lary score after the passage was read was statistically 
significant, fewer than three new words were learned from 
the passage. Perhaps such modest acquisition of vocabu- 
lary does not reflect the amount of learning that takes 
place in college courses, leading to lower relationships 
with metacognitive monitoring scores on the second ad- 
ministration. The degree of similarity between the 
knowledge monitoring task and classroom learning might 
have been greater if students had been instructed to study 
the passage more intensely or asked to pay special atten- 
tion while reading words they had previously seen on the 
KMA word list. Such instructions might have increased 
the correlations with GPA for the second administration. 
It remains for further research to explore that possibility. 

Table 3 also indicates that the correlations with the 
number correct on the vocabulary test were generally sim- 
ilar to the relationships with correct estimates of word 
knowledge. Due to the varying Ns in the different cells, 
the significance of differences in correlations was exam- 
ined using a t test developed by Hotelling (1940). 
Relationships with GPA based on the total group indi- 
cated that seven correlations with knowledge monitoring 
scores were higher than similar correlations with raw 
scores (one difference was significant at p < .05) while the 
correlations based on raw scores were higher three times 
(none significantly so). For freshmen, the correlations with 
knowledge monitoring scores were higher twice, though 
not significantly so, than those with raw scores while the 
correlations with raw scores were higher eight times (two 
significant at p < .05). Finally, for nursing students, corre- 
lations based on knowledge monitoring scores were 
higher five times (none significant), while relationships 
based on raw scores were higher five times (one significant 
at p < .05). Thus, the knowledge monitoring scores ap- 
peared to add little independent explanatory power to the 
relationship with grades beyond that accounted for by the 
number correct on the vocabulary test. 
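The comparison of two correlations that share a criterion (here, GPA) can be sketched as follows. The formula is the commonly cited form of Hotelling's (1940) t for dependent correlations, with N − 3 degrees of freedom. The report does not give the correlation between knowledge monitoring scores and raw scores, so the value of .60 used below is purely hypothetical, and the resulting t does not reproduce any statistic from the study.

```python
import math


def hotelling_t(r_xy, r_zy, r_xz, n):
    """Hotelling's t for two dependent correlations sharing variable y.

    r_xy, r_zy: correlations of predictors x and z with the criterion y.
    r_xz:       correlation between the two predictors.
    Returns (t, degrees of freedom), with df = n - 3.
    """
    # Determinant of the 3 x 3 correlation matrix of x, z, and y.
    det = 1 - r_xy ** 2 - r_zy ** 2 - r_xz ** 2 + 2 * r_xy * r_zy * r_xz
    if det <= 0:
        raise ValueError("correlations do not form a valid correlation matrix")
    t = (r_xy - r_zy) * math.sqrt((n - 3) * (1 + r_xz) / (2 * det))
    return t, n - 3


# English GPA, total group, Administration 1 (Table 3): .30 versus .19, N = 72.
# r_xz = 0.60 is an assumed value, not reported in the source.
t_stat, df = hotelling_t(0.30, 0.19, 0.60, 72)
print(f"t({df}) = {t_stat:.3f}")
```

Because the test depends on the predictor intercorrelation, the same pair of correlations can yield quite different t values as that quantity changes.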


The findings of this study, in contrast to the findings 
from the first two investigations, suggest that estimates of 
knowledge seem to account for little independent variance 
in GPA above that attributable to the number correct on 
the vocabulary test. Conceivably the low reliability of col- 
lege grades (Werts, Linn, & Joreskog, 1978; Willingham, 
Lewis, Morgan, & Ramist, 1990), referred to above, may 
have contributed to these findings. The criterion in the 
first two studies consisted of test scores, which are much 
more reliable than grades. 

Study IV. Predicting College Achievement from 
KMA Scores 

The preceding study dealt with concurrent validity in 
relating knowledge monitoring scores to students’ 
achievement in college. The fourth study investigated the 
predictive validity of KMA scores by examining whether 
metacognitive estimates of knowledge would predict en- 
tering students’ performance during their first year of 
college. 

Participants and Procedures 

The materials used were identical to those described in 
Study III. They were administered while students attended 
a pre-freshman skills program before beginning their first 
semester of college. Achievement was determined by ob- 
taining students’ GPAs at the end of their first year of 
college in the same subjects examined in the prior study: 
English, humanities, sciences, and social sciences, as well 
as the combined GPA. The sample consisted of 115 stu- 
dents (59 female) participating in a skills program 
intended for students considered at risk of doing poorly in 
their first year of college. 

Each participant in Study IV completed all of the study 
materials and took similar types of courses. High- and 
low-achievement groups were created by dividing stu- 
dents at the GPA median for the different academic areas 
and for the combined GPA. Then differences in knowl- 
edge monitoring ability between the groups were 
examined. Mixed between- and within-subjects analyses 
of variance were computed to determine the significance 
of differences between the first and second administra- 
tions, and of differences in estimates of knowledge 
between groups above and below the GPA median. At the 
conclusion of freshman year, it was determined that 95 of 
the 115 original participants had completed some courses 
at the college. 
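The median split used to form the achievement groups can be sketched briefly. The report does not say how students falling exactly at the median were assigned, so the rule below (strictly above the median counts as high) is an assumption, and the GPA values are hypothetical.

```python
from statistics import median


def median_split(gpas):
    """Label each GPA 'high' if it lies above the sample median, else 'low'.

    Assumption: values equal to the median are placed in the 'low' group.
    Returns (median value, list of labels in the input order).
    """
    cut = median(gpas)
    return cut, ["high" if g > cut else "low" for g in gpas]


# Hypothetical GPAs for six students.
gpas = [2.1, 3.4, 2.8, 3.9, 1.7, 3.0]
cut, groups = median_split(gpas)
print(cut, groups)
```

The group labels would then serve as the between-subjects factor, with test administration as the within-subjects factor.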

Results and Discussion 

The number of correct estimates students made of their 
word knowledge was determined. As in the prior studies, 
correct estimates were defined by combining the ++ and −−
categories. Preliminary analysis again indicated that
there were no differences between the results obtained for 
the expository and narrative passages, nor between those
for the words defined explicitly or implicitly. Therefore, 
these data were pooled for the succeeding analyses. 

ANOVA indicated that, as expected, students above
the median GPA (N = 48) made significantly more accurate
overall estimates of their knowledge [mean = 49.2,
F(1,93) = 6.42, p < .05] on both administrations than did
those below the median (N = 47, mean = 45.8); the size of
that effect, determined by eta² (SPSS, 1993), was .065.
Also as expected, there was a significant difference between
the first (mean = 22.9) and second administration
(mean = 24.5) of the word list and vocabulary test
[F(1,93) = 14.95, p < .01, eta² = .138], though there was
no interaction between these variables. A similar analysis
was made using the number correct on both administrations
of the vocabulary test as the dependent variable. The
analysis indicated that the differences between the high-
(mean = 43.2) and low-GPA group (mean = 39.3) on the
vocabulary test were not significant [F(1,93) = 2.73, eta²
= .029], while the differences between the first (mean =
17.7) and second administrations (mean = 24.5) were
highly significant [F(1,93) = 198.04, p < .001, eta² = .68];
again there was no interaction between variables.
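The effect sizes in this paragraph can be recovered from the reported F ratios and degrees of freedom. The sketch below assumes the usual identity for (partial) eta squared, F × df_effect / (F × df_effect + df_error). That the SPSS output was computed this way is an inference from how closely the numbers agree, not something stated in the report.

```python
def eta_squared(f_ratio, df_effect, df_error):
    """Effect size implied by an F ratio and its degrees of freedom."""
    explained = f_ratio * df_effect
    return explained / (explained + df_error)


# (label, F, df_effect, df_error, eta squared reported in the text)
reported = [
    ("GPA group, knowledge estimates", 6.42, 1, 93, 0.065),
    ("Administration, knowledge estimates", 14.95, 1, 93, 0.138),
    ("GPA group, number correct", 2.73, 1, 93, 0.029),
    ("Administration, number correct", 198.04, 1, 93, 0.68),
]

recomputed = {
    label: round(eta_squared(f, df1, df2), 3)
    for label, f, df1, df2, _ in reported
}

for label, f, df1, df2, stated in reported:
    print(f"{label}: recomputed {recomputed[label]:.3f}, reported {stated}")
```

All four recomputed values agree with the reported effect sizes to the precision given in the text.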

High- and low-achieving groups in English, humani- 
ties, science, and social science courses were also identified 
by dividing the students at the median GPA for each of 
these subject areas and examining the significance of dif- 
ferences in the number of correct estimates of knowledge. 
In English, the overall differences in the accuracy of the estimates
between students above (mean = 48.9) and below
the median (mean = 45.4) were significant [F(1,82) = 6.18,
p < .02, eta² = .07], as were the differences between the
first (mean = 45.6) and second administrations [mean =
48.7, F(1,82) = 11.92, p < .01; eta² = .127]. Furthermore,
there was an interaction between groups and administrations
[F(1,82) = 4.41, p < .05; eta² = .051]. The
interaction, as shown in Figure 4, suggests that while the 
accuracy of both groups’ estimates of known and 



Figure 4. Interaction of English GPA groups, hits, and test administrations. 


unknown words increased from the first to the second ad- 
ministration, higher achieving students made greater 
gains. A similar analysis was computed for the number
correct on both vocabulary test administrations. The findings
indicated that the difference between the high- (mean
= 42.9) and low-GPA groups [mean = 38.9, F(1,82) =
5.43; eta² = .062] was slightly smaller than that determined
when the metacognitive knowledge scores were
used, but there was a stronger effect for differences between
first (mean = 18.0) and second administrations
[mean = 23.6, F(1,82) = 169, p < .001; eta² = .673]; there
was no evidence of interaction in these results.

Similar analyses were made for students above and
below the median GPA in humanities courses (Art, History,
Music, Philosophy, World Civilization, World Humanities,
and World Arts). Differences between high (mean = 49.4)
and low humanities GPA groups (mean = 45.3) were also
significant [F(1,81) = 7.96, p < .01; eta² = .089], as were
differences between the first (mean = 23.0) and second administrations
[mean = 24.5, F(1,81) = 9.94, p < .001; eta²
= .109]; there was no interaction. The same type of analysis
was also computed for the number correct on the first and
second vocabulary tests; again it revealed somewhat
smaller differences between the high- (mean = 43.1) and
low-GPA groups [mean = 39.0, F(1,81) = 4.18, p < .05; eta²
= .049] and larger differences between the first (mean =
17.8) and second administrations [mean = 23.4, F(1,81) =
179.2, p < .001; eta² = .689] than the results for knowledge
monitoring scores. There were no significant differences between
the science or social science GPA groups using either
the knowledge monitoring scores or the raw scores.

The relationships between metacognitive scores and 
GPA were generally similar to those reported in Study III, 
supporting the predictive validity of the KMA scores. In 
contrast with the prior study, in which both KMA and 
raw scores had fairly similar patterns of relationship, the 
metacognitive scores had a significant effect on overall 
GPA, whereas the raw scores did not. Furthermore, the 
KMA scores accounted for more variance between groups 
than did the number correct on the vocabulary test in two 
of three other comparisons, supporting the construct va- 
lidity of the KMA procedure. 

Several factors are likely to have reduced the magni- 
tude of the effects and the generalizability of the results to 
other groups of college students. As in the first study, par- 
ticipants in the pre-freshman program were considered to 
be at risk of poor performance in college. This may have 
reduced the range of achievement for the sample and, 
therefore, may also have reduced the differences in knowl- 
edge monitoring ability between the groups. Furthermore, 
even though data were not collected in sections of the pre- 
freshman program devoted exclusively to English as a 
Second Language (ESL), some of the students were en- 
rolled in both ESL and other sections, and thus ended up 
as part of the sample. The presence of non-native English
speakers may also have reduced the variability among par- 
ticipants and narrowed group differences. Further research 
limited to native English speakers, who are more hetero- 
geneous in terms of academic skills than was the present 
sample, is needed to determine whether knowledge moni- 
toring ability differences between low- and high-achieving 
students are greater than those reported here. 

In general, KMA scores seemed to more successfully 
differentiate the capable students, whose grades were 
above the median, from those less able than did the raw 
scores, replicating the findings of Studies I and III. The 
knowledge monitoring scores accounted for anywhere 
from 1 to 4 percent more variance than did similar 
analyses using the raw scores. It was also interesting that 
the analysis of differences in raw scores between the first 
and second vocabulary test administrations always ac- 
counted for substantially more variance than did a similar 
analysis based on knowledge monitoring scores. The 
latter finding is reasonable and supports the construct va- 
lidity of the KMA procedure in that most students learned 
some new words from reading the passage, though their 
knowledge monitoring ability was not equally enhanced. 
However, it should be noted that the results for English 
grades indicated that there were greater increases in 
knowledge monitoring ability for capable students than 
for their less able peers (see Figure 4). These findings sug- 
gest that while all students increased both their 
demonstrated knowledge and their knowledge monitoring 
ability from first to second test administration, the in- 
creases in monitoring ability were greater for more 
capable students (i.e., those whose English grades were 
above the median). Apparently there was a greater degree 
of improvement in such students’ metacognitive skills 
than in those of their less able colleagues. 
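The statement that knowledge monitoring scores accounted for 1 to 4 percent more variance than raw scores can be checked against the effect sizes reported for the GPA group comparisons in this study. The short sketch below uses only those reported eta-squared values; the dictionary layout and labels are illustrative.

```python
# (eta squared for knowledge monitoring scores, eta squared for raw scores),
# taken from the GPA group comparisons reported for Study IV.
group_effects = {
    "Combined GPA": (0.065, 0.029),
    "English GPA": (0.070, 0.062),
    "Humanities GPA": (0.089, 0.049),
}

# Difference in variance accounted for, in percentage points.
extra_variance = {
    area: round((kma - raw) * 100, 1)
    for area, (kma, raw) in group_effects.items()
}

for area, points in extra_variance.items():
    print(f"{area}: {points} percentage points")
```

The differences run from just under 1 to 4 percentage points, in line with the range given in the text.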

It should be noted that many of the students in this 
sample took less than a full-time schedule of courses. This 
fact is likely to have decreased the reliability of the GPA, 
because it was based on fewer courses and credits than is 
usually the case after a year of college. This may also limit 
the generalizability of the results to other groups of stu- 
dents. Therefore, to increase both the reliability and 
variability of this criterion, it would be useful to investi- 
gate the predictive validity of the KMA procedure for a 
large number of full-time students. 

Study V. Knowledge Monitoring Ability and
Learning among Vocational High School 
Students 4 

College students were used as subjects in all of the pre- 
vious studies. Individuals attending college are likely to be 
more studious and academically oriented than are students 
at secondary levels and therefore more likely to be reflective 
about what they know and do not know. Thus, one purpose
of Study V was to examine the applicability of the KMA
procedure to students attending a vocational high school. 

Participants and Procedures 

All of the participants attended a vocational high 
school in a large urban school system. A total of 61 stu- 
dents (59 male) participated in this study. The students’ 
ages ranged from 16 to 19. This study employed the word
list and vocabulary test described in the two preceding 
studies; the reading passage was not administered. Stu- 
dents were tested during one of their regular school 
classes. In addition, test anxiety scales were administered 
and students were asked to estimate their grades on tests 
given in one of their vocational classes. Students’ overall 
GPAs were obtained from the school’s permanent records. 

Results and Discussion 

Students were divided into two groups at the GPA median.
Two multivariate analyses of variance (MANOVA)
were computed: the first examined differences between the
high- and low-GPA groups in terms of the accuracy of students’
estimates of their knowledge (using the ++, +−,
and −− scores) and the second analysis examined group
differences in students’ word knowledge (the sum of ++
and −+ scores equalled the number correct on the vocabulary
test). The MANOVA indicated that the overall
differences in knowledge monitoring ability between the
high- and low-GPA groups were significant [Wilks F(3,57)
= 3.17, p < .05, effect size = .143]. Univariate analyses
showed that only the difference between the high- (mean
= 17.8) and low- (mean = 14.4) GPA groups on the ++
scores was significant [F(1,59) = 9.35, p < .01].

The MANOVA computed on group differences in the
number correct on the vocabulary test also indicated a significant
difference between the groups [Wilks F(2,58) =
5.35, p < .01, effect size = .156]. Univariate analyses
showed that the differences in ++ scores were the same
as in the preceding analysis; however, in this analysis,
group differences between the high- (mean = 3.5) and low-
(mean = 5.2) GPA groups on the −+ scores were also significant
[F(2,58) = 5.59, p < .05]. As expected, the results
indicated that the capable students estimated that they
knew and actually knew more words than did those with
lower GPAs, while the latter group more often estimated
that they did not know words that they then answered
correctly on the vocabulary test.

The significant differences between the two GPA 
groups replicated the results of the two prior studies and
confirmed the relationships between metacognitive knowledge
estimates and classroom learning. The results of the
second analysis do not support the additional importance 
of obtaining students’ estimates of their knowledge, be- 
cause the differences between the GPA groups in their 
actual vocabulary knowledge were also significant, and 
slightly greater than the differences in knowledge estimates.
However, the word list and vocabulary test were
developed for a college population; perhaps these mate- 
rials were so unfamiliar to these high school students that 
their estimates were based on little more than chance. 

Study VI. Knowledge Monitoring Ability among 
High School Dropouts 5 

The high percentage of students who drop out of school 
is a major problem, especially in times when even most 
entry-level positions in business and industry call for 
greater levels of skill than ever before. At a time when the 
advent of the information superhighway is beginning to re- 
define the job functions of lower- and mid-level workers, it 
is vital that students complete a secondary school educa- 
tion in order to have better chances of finding employment. 
Nevertheless, there are indications that the number of 
school dropouts is increasing. “A national estimate sug- 
gests that 25 percent of fifth graders will not make it 
through high school graduation” (Mann, 1986, p. 309). 

There are many reasons for students’ dropping out of 
school, but Tanner (1990) suggests that “School-based 
reasons are the most important self-reported explanation 
of dropping out for all groups of adolescents” (p. 80). 
Chief among these is poor performance in school. When 
asked why they had dropped out of school, more than 
one-third of the students said, “Because I had bad grades” 
or “because I did not like school” (Mann, 1986, p. 309). 
These findings are substantially similar to those reported 
by Ekstrom, Goertz, Pollack, & Rock (1986). It therefore 
seemed reasonable to assume that students who drop out 
of high school would have lower metacognitive knowl- 
edge monitoring abilities than other students. This 
assumption was examined in Study VI. 

Participants and Procedures 

The word list and vocabulary test employed in Studies 
II to V were administered, together with some test anxiety 
scales that will be described later in this report. The 
reading passage was not used. 

A total of 89 subjects participated. The dropout group 
consisted of 42 individuals (14 female) who were at- 
tending a General Equivalency Diploma program. A 
group of currently enrolled students (47 total, 16 female) 
who had a school GPA of at least B- also participated. 
None of the students in the latter group had given any in- 
dication that they were at risk of dropping out of school. 

Results and Discussion 

Two MANOVAs, identical to those computed in the
preceding study, were used to determine the significance
of differences between the high school dropouts and continuing
students. The first analysis found significant
overall group differences [Wilks F(3,79) = 4.08, p < .01,
effect size = .134] in KMA scores (+ +, + −, and − −
scores). Univariate analyses indicated that the dropout
group (mean = 12.7) differed from the continuing students
(mean = 16.2) on the + + scores [F(1,81) = 8.83, p < .01]
and on the + − scores [dropout mean = 10.6, continuing
students mean = 8.5; F(1,81) = 6.11, p < .02]. A similar analysis
of actual knowledge (+ + and − + scores) also indicated
significant, though somewhat smaller, group differences
[Wilks F(2,80) = 4.61, p < .01, effect size = .103]. Univariate
analyses indicated that only the difference in the
+ + scores was significant; of course, the statistics for this
effect were identical to those in the first MANOVA.
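
The effect sizes reported alongside the multivariate tests in this report are numerically consistent with a partial eta squared recovered from the F ratio and its degrees of freedom, (F × df1) / (F × df1 + df2). The report does not name the formula, so the following is only a sketch under that assumption; the function name `partial_eta_squared` is hypothetical.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F ratio and its degrees of freedom.

    Assumption: the report's "effect size" is this statistic. The source
    does not state which formula was used.
    """
    explained = f_value * df_effect
    return explained / (explained + df_error)


# Study VI, KMA scores: Wilks F(3,79) = 4.08, reported effect size .134
print(round(partial_eta_squared(4.08, 3, 79), 3))  # 0.134

# Study VI, actual knowledge: Wilks F(2,80) = 4.61, reported effect size .103
print(round(partial_eta_squared(4.61, 2, 80), 3))  # 0.103
```

The same calculation reproduces the other effect sizes quoted with a Wilks F in this section, which is why the assumption seems reasonable.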

The results indicate that, as expected, students who 
dropped out of high school had less adequate knowledge 
monitoring abilities than continuing students. Analysis of 
differences in raw scores revealed similar, though some- 
what smaller, effects. The results suggest that the limited 
knowledge monitoring abilities of students who dropped 
out of school may have made schoolwork more difficult 
for them and contributed to poor performance, which is 
consistent with the descriptions of school dropouts in the 
literature. 

Summary: Knowledge Monitoring Ability and 
Classroom Learning 

As expected, the four studies discussed in this section 
found significant relationships between metacognitive 
knowledge monitoring scores and classroom learning. The 
studies used different student samples: enrolled college stu- 
dents, those about to enter college and enrolled in a 
pre-freshman skills program, vocational and regular high 
school students, and those who had dropped out of 
school. Because relationships with knowledge monitoring 
ability were in the expected direction for the different sam- 
ples, it may be inferred that the KMA is generalizable 
across a variety of student groups. In most of the studies, 
the KMA scores accounted for more variance than did the 
raw vocabulary scores, supporting the construct validity of 
the KMA procedure. 

Prediction of Performance and 
Metacognitive Knowledge 
Monitoring Ability 

It was reasoned that students who were capable of ac- 
curately estimating their knowledge of vocabulary on the 
KMA should also be more accurate in predicting their per- 
formance on examinations related to their current studies. 
This section will describe three investigations examining 
this assumption. 

There has been some research on students’ predictions
of their performance in courses and on tests, though none
relating the predictions to metacognition or knowledge 
monitoring ability. Keefer (1971) found that college stu- 
dents who accurately estimated their performance 
achieved at a significantly higher level than did those who 
estimated less accurately and had a more positive self-con- 
cept than their low-estimating counterparts. Holen and 
Newhouse (1976) found that students’ predictions of their 
grades on a course examination correlated as highly with 
actual performance as did their GPAs and were signifi- 
cantly more accurate predictors of performance than 
other variables, such as grades in prerequisite courses or 
GPAs. Furthermore, students’ performance predictions 
contributed significant unique variance to actual final 
grade, above that contributed by high school and college 
GPA, or grades in prerequisite courses in predicting that 
grade. Harris (1990) found that students who were accu- 
rate estimators of their test performance in psychology 
earned a significantly higher final grade in Introductory 
Psychology than did less-accurate estimators. 

The research on prediction of performance suggests that 
more capable students make more accurate predictions of 
their performance than do their less able counterparts. Be- 
cause the studies described in the preceding section found 
that higher KMA scores were associated with higher GPAs, 
the findings dealing with predictions of performance sug- 
gest that students who make accurate metacognitive 
assessments of their knowledge should also make more ac- 
curate predictions of their test scores. 

Study VII. Estimates of Performance and
Predicting Scores on Standardized Tests

In addition to describing the relationship between stu- 
dents’ estimation of and actual performance on tests, this 
study also varied the reading passage used, to examine its 
contribution to students’ estimates of their test perfor- 
mance. Furthermore, it was decided to examine 
performance on a standardized test of known reliability to 
reduce possible error. Studies I and II used a standardized 
measure of reading comprehension (the DTLS, College 
Board, 1979) as the criterion and the results relating test 
performance to KMA were more positive than were the 
results where less-reliable student grades were used. 
Therefore, a test with known reliability (.88) was used in 
this study. 

It was expected that Introductory Psychology students 
who could accurately monitor their knowledge would 
also be more accurate in predicting their actual and esti- 
mated scores on the Advanced Placement (AP) Test in 
Psychology (College Board, 1988) before and after com- 
pleting it, and that they would also earn higher scores on 
the AP test than would their peers who estimated less ac- 
curately. Finally, as suggested by other studies of students’ 
ability to estimate their performance, it was predicted that
students with high KMA scores would expect to obtain
higher grades than those with lower scores. 

Participants and Procedures 

A total of 77 students (41 female) taking an Introduc- 
tory Psychology course at one of the campuses of a large 
urban university volunteered to participate in the study. 
Participation in this study satisfied a course requirement. 

The AP Examination in Psychology (College Board, 
1988) was administered to students enrolled in an Intro- 
ductory Psychology course. Students received a 
description of the different areas covered by the AP test 
and were asked to predict how many of the 100 items they 
would be able to answer correctly before they took the test 
and again after the test was completed. Half of the sample 
(N = 39) was randomly assigned to read the expository 
version of the test passage used in two of the studies de- 
scribed in the earlier section, while the other half (N = 38) 
performed an unrelated task, reading the text selection ti- 
tled “Teaching the Mentally Retarded” from the Sentence 
Verification Technique (SVT) (Royer, Carlo, Dufresne, & 
Mestre, 1994) and answering questions on that passage. 
The same word list and vocabulary test used in Studies II 
to VI were then administered to all participants. 

Students were also asked to predict their final grade in 
the Introductory Psychology course they were taking. On 
this campus, the accuracy of their grade predictions could 
not be determined because regulations protecting students’ 
privacy made it impossible to obtain that information. 

Results and Discussion 

More accurate KMA scores were expected for the 
group responding to the word list and vocabulary test 
after reading the passage compared to the other group 
who received the SVT, which was irrelevant to the task. 



Figure 5. Differences between reading passage and SVT groups on knowledge
monitoring ability scale.

Surprisingly, MANOVA based on the total number of accurate
estimates (+ + and − −) revealed no significant
differences between the groups. (See Figure 5.) Examination
of the four basic KMA scores for both explicitly and
implicitly defined words indicated that there were some
group differences (see Figure 6), but that these differences
were canceled out when the data were combined into total
number of correct estimates.



Figure 6. Differences between reading passage and SVT groups on raw
scores for explicitly and implicitly defined words.

When a MANOVA was performed on six of the basic
scores (the scores for the + − category for explicitly and
implicitly defined words were eliminated to reduce
collinearity), the overall differences between the groups
were significant [F(6,70) = 3.71, p < .01]. Univariate F
tests indicated that the students who read the passage
made more accurate metacognitive estimates on explicitly
defined words in the + + category [F(1,75) = 5.97, p <
.02] and had fewer explicitly defined words in the
category [F(1,75) = 4.74, p < .05].

Student groups were divided at the median on total
number of accurate metacognitive estimates (combining
the + + and − − categories). A MANOVA was performed
to examine the significance of the differences on students’
predictions of their AP scores before and after they took
the test, their actual AP scores, and their expected final
grades in the Introductory Psychology class. There were
no differences on the AP test data or on the expected final
grades between the groups who read either the passage or
the SVT [F(4,69) < 1], but differences between the high
and low groups were significant [F(4,69) = 2.83, p < .05,
effect size = .141]. Univariate tests indicated that the high
knowledge monitoring group obtained higher AP scores
[mean = 43.6, F(1,72) = 7.81, p < .01], and that differences
in expected final grade in the course were just short
of significance [F(1,72) = 3.40, p < .10]. There was no interaction
between the groups who read either the passage
or the SVT and the knowledge monitoring groups.
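
The median split used to form the high and low knowledge monitoring groups can be sketched as follows. The student identifiers and totals are hypothetical, and the handling of scores that fall exactly on the median is an assumption, since the report does not say how ties were assigned.

```python
from statistics import median


def median_split(scores: dict[str, int]) -> tuple[list[str], list[str]]:
    """Split students into high and low groups at the median score.

    Assumption: students scoring exactly at the median are placed in the
    low group; the report does not describe its tie rule.
    """
    cutoff = median(scores.values())
    high = [student for student, score in scores.items() if score > cutoff]
    low = [student for student, score in scores.items() if score <= cutoff]
    return high, low


# Hypothetical totals of accurate estimates (++ plus --) for six students.
totals = {"s1": 31, "s2": 24, "s3": 28, "s4": 19, "s5": 35, "s6": 22}
high_group, low_group = median_split(totals)
print(high_group)  # ['s1', 's3', 's5']
print(low_group)   # ['s2', 's4', 's6']
```

With an even number of students and no ties, as here, the two groups are equal in size; ties at the median would make the low group larger under this rule.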

The data were also analyzed with respect to the number
correct on the vocabulary test, dividing the groups at the
median and computing the significance of differences in relation
to AP test scores and final course grades. The results
were similar in that there were no differences between the
groups who had read the passage or the SVT, but there
was a significant difference between the groups above and
below the median on pre-passage KMA score [F(4,69) =
6.47, p < .01, effect size = .27]. Univariate analysis again
indicated only one significant difference on actual AP
scores between the groups above (mean = 45.4) and below
(mean = 34.2) the median on the vocabulary test score.
Again there was no interaction among the variables. Unlike
the prior studies, where differences in metacognitive
knowledge estimates were usually greater than those on
the vocabulary test raw score, the effect size for these data
was larger using the vocabulary test results than the
knowledge monitoring data (.27 compared to .14).

The results indicate that students with a high score on 
the vocabulary test and with high ability to monitor their 
word knowledge also obtained higher scores on the AP 
exam and, at a marginally significant level, expected 
higher final grades in the course. The absence of group dif- 
ferences on AP score predictions before taking the test was 
not surprising because students were unfamiliar with the 
test. Beyond being informed about the categories of 
knowledge covered, they had no information about the 
difficulty of the items, the types of preparation possible for 
the test, or specifically what they would be questioned on. 
The absence of differences on students’ predictions after 
taking the test was a little more surprising, because par- 
ticipants now had a much clearer idea about what the test 
covered. Perhaps this brief exposure to the test was inad- 
equate to familiarize them with the domain covered by the 
AP examination. 

Study VIII. Knowledge Monitoring Ability and 
Estimates of Academic Achievement 

Ideally, of course, students’ predictions of their perfor- 
mance in courses for which they were registered should 
have been studied. Unlike the situation with the AP test, students
should have enough information to make fairly
accurate predictions about their final grades in courses 
based on their experience in the class and with the subject 
matter, the instructor, and the procedures of the course. 
This study was intended to examine this assumption, in 
addition to attempting to replicate the findings from the 
AP study. 

Participants and Procedures 

The procedures were identical to those in the previous 
study with two exceptions: first, the predictions students 
made about their final grades were compared to the actual 
final grades obtained in the course; second, students took 12 
quizzes in this class. (The instructor used the 10 highest quiz 
scores to determine the final grade.) The grades on these
quizzes were available as additional dependent variables.

A total of 75 college students enrolled in Introductory 
Psychology participated in this study. The students re- 
ceived extra credit for taking part in the research. 

Results and Discussion 

The first set of analyses was computed to examine the
consistency between the findings in this study and the preceding
one. As in the prior study, a test for the significance
of differences between the groups who read the passage
and the SVT on the + +, + −, and − − scores revealed no
differences between the groups. When the component
scores based on explicitly and implicitly defined words
were examined, overall differences between the groups
were significant [F(6,68) = 2.57, p < .05]. Univariate
analysis indicated that the group reading the passage had
fewer  scores for the explicitly defined words [F(1,73)
= 7.69, p < .01] and more + + scores for the implicitly defined
words [F(1,73) = 7.29, p < .01]. These results are
identical to those in the preceding study and suggest that
combining the data may have obscured existing group differences.
Both sets of results point to the importance of
conducting a study specifically designed to determine
which set of data is the best indicator of the latent knowledge
monitoring ability variable.

The analysis of differences between groups scoring
high and low on knowledge monitoring ability in AP test
scores predicted before and after the test, actual AP test
scores, and final grades was also similar to that in the
preceding study with one addition: students’ actual final
grades in the course were available as an additional
dependent variable. Two groups were created by dividing
students at the median on the total number of accurate
estimates of vocabulary knowledge, and a MANOVA was
computed to examine the significance of differences on the
AP and grade data; nine students were eliminated due to
missing information. No differences between the groups
who read either the passage or the SVT were found
[F(5,58) = 1.37]. In contrast with the prior study, the
differences between the two groups in knowledge monitoring
ability only approached significance [Wilks F(5,58) = 2.21,
p = .066; effect size = .16]. Univariate analysis indicated
that the high knowledge monitoring group had significantly
higher AP scores (mean = 45.2) than did the lower group
[mean = 36.7; F(1,62) = 10.02, p < .01]; there were no
differences on predicted score either before or after the
AP exam was taken, or on predicted and actual final grades.

The finding that the high and low knowledge moni- 
toring ability groups differed only on actual AP test 
performance, rather than on any of the predictions, also 
replicated that of the prior study. The failure to find dif- 
ferences on the final-grades variable may have been a 
function of the limited range of the grades; A-D grades 
(there were no F grades in this sample) were converted to
their numerical equivalents, yielding only four scores. Furthermore,
76 percent of the grades were B or higher,
further limiting their variability. The interaction between 
knowledge monitoring ability groups and those who read 
either the passage or the SVT was of borderline signifi- 
cance [F(5,58) = 2.18, p = .069], principally attributable 
to the fact that the low-ability group's estimates of their 
AP scores and their final course grades were actually 
higher than those of the high-ability group, while the ac- 
tual scores and grades of the former group were lower 
than those of the latter. 

An identical MANOVA was performed with students
divided at the median on the number of words correct on
the vocabulary test as the independent variable. There were
highly significant differences between the groups [Wilks
F(5,58) = 5.70, p < .001; effect size = .33]. Univariate
analysis indicated that the group scoring high on the vocabulary
test also had higher AP scores [mean = 47.0,
F(1,62) = 22.89, p < .001] than the group scoring low
(mean = 35.1). Contrary to the analysis based on predictions
of knowledge, with this analysis, the high-scoring
group also received higher final grades [mean = 90.4,
F(1,62) = 5.24, p < .05] than did the low-scoring group
(mean = 85). Again, the interaction between the groups
who read either the passage or the SVT and the vocabulary
test results approached significance [Wilks F(5,58) = 2.12,
p = .076], attributable to the low-scoring group predicting
higher AP scores both before and after, and final grades,
while actually obtaining lower scores on all three measures.

The second set of analyses examined the relationship 
between knowledge monitoring scores and indices of in- 
class student performance, such as scores on quizzes and 
on the essay and multiple-choice parts of the final exami- 
nation. Because the instructor informed students that only 
the 10 highest scores on the 12 quizzes given would count 
for the final grade, many students missed some quizzes. 
Therefore, for students taking at least 10 of the quizzes, 
the mean score on all the quizzes taken was used as one of 
the dependent variables. Student groups were then divided 
at the median on the knowledge monitoring scores and a 
MANOVA was performed on the quiz and final exami- 
nation data; missing data limited this analysis to 70 
students. 
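
The quiz rule described above (a mean over all quizzes taken, computed only for students who took at least 10 of the 12) can be sketched as follows. The function name and the use of `None` to mark a missed quiz are assumptions made for illustration, and the scores are hypothetical.

```python
from typing import Optional


def mean_quiz_score(
    quiz_scores: list[Optional[float]], minimum_taken: int = 10
) -> Optional[float]:
    """Mean over the quizzes a student actually took.

    Returns None when fewer than `minimum_taken` quizzes were taken, so the
    student is left out of the analysis. A missed quiz is recorded as None
    (an assumption; the report does not describe its data coding).
    """
    taken = [score for score in quiz_scores if score is not None]
    if len(taken) < minimum_taken:
        return None
    return sum(taken) / len(taken)


# Hypothetical student who missed 2 of the 12 six-point quizzes.
scores = [5, 4, None, 6, 3, 5, 4, None, 5, 6, 4, 5]
print(mean_quiz_score(scores))  # 4.7

# Hypothetical student who took only 9 quizzes and is therefore excluded.
print(mean_quiz_score([5, 4, None, 6, None, 5, 4, None, 5, 6, 4, 5]))  # None
```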

There were no significant differences on the class performance
indices between the groups taking the SVT or
reading the passage [F(4,63) = 1.04]. There was an overall
significant difference between the high- and low-knowledge
monitoring ability groups [Wilks F(3,64) = 4.36, p < .01,
effect size = .17]. Univariate analyses indicated that the
high-ability group had significantly higher scores on the
multiple-choice part of the final examination (mean = 25.1)
as compared to the low-ability group [mean = 21.2, F(1,66)
= 12.66, p < .01]. Differences between the groups on mean
quiz scores were of borderline significance [F(1,66) = 3.02,
p = .09], with the high-ability group getting better scores
(mean = 4.51) than the low-ability group (mean = 4.1).
(Each quiz had a total of six raw score points.) There was
no interaction between knowledge monitoring ability and
whether groups read the passage or the SVT.

An identical analysis was performed with students divided
at the median on the number of words correct on
the vocabulary test as the independent variable. There
were overall differences for the high- and low-scoring
groups [Wilks F(3,64) = 6.44, p < .01, effect size = .232].
The high-scoring group had significantly higher scores on
both the essay [mean = 17.2, F(1,66) = 7.44, p < .01] and
multiple-choice [mean = 25.5, F(1,66) = 18.72, p < .01]
parts of the final exam and on the mean for the quizzes
[mean = 4.6, F(1,66) = 7.13, p = .01] than did the lower-scoring
group (means = 14.5, 20.9, and 4.0, respectively).
In this study as in the preceding one, vocabulary score was
more effective than knowledge monitoring ability in terms
of differentiating between students on AP and final grades
(.33 effect size vs. .16) and on classroom tests (.232 compared
to .17 effect size).

Knowledge Monitoring Ability and Estimates of 
Performance among Vocational High School 
Students 

In Study V, examining the relationship between knowl- 
edge monitoring ability and classroom learning among 
vocational high school students, the participants were also 
asked to predict their grades on a final course examination 
both prior to and after taking it; the actual score on that test 
was available as a dependent measure. MANOVAs indi- 
cated that neither the metacognitive knowledge monitoring 
estimates nor the raw scores on the vocabulary test were sig- 
nificantly related to either the predicted or actual grades. The 
failure to find any differences is at variance with the findings 
of the two preceding studies involving college students. 

There are a number of differences between the studies 
involving vocational high school and college students, in 
addition to the population differences, that may account 
for the diverse findings. The vocational high school stu- 
dents were asked to predict their performance on a final 
exam in the class they were taking and presumably had a 
much better idea of the content of the exam and how to 
prepare for it than did the college psychology students 
who had very little basis for knowing what to expect on 
the AP test and could not prepare for it at all. Further- 
more, given that the vocational students had been graded 
on other exams in that class, they — unlike the college stu- 
dents — may have known what grade to expect based on 
their prior performance. Thus, prior experience may have 
been more important than either their declarative knowl- 
edge or their metacognitive knowledge monitoring ability 
in determining the high school students’ estimates. 


Summary: Estimates of Performance and Knowledge 
Monitoring Ability 

One striking finding of two of the studies involving col- 
lege students was that in terms of knowledge monitoring 
ability, the strongest effects were found for students' actual 
performance, either on tests or in class, rather than their es- 
timates. Students’ estimated performance on the AP exam, 
or their predicted achievement in class, was typically not 
significantly related to the KMA scores. On the other hand, 
performance on the AP test or on final exams (at least the 
multiple-choice part of the exam in Study VIII) was signif- 
icantly related to knowledge monitoring ability. These 
results may be partially attributable to unrealistic estimates 
of students in the lower knowledge monitoring ability 
groups. 

There was a large difference between the accuracy of 
vocational high school students and college students in 
predicting test performance. The correlations between
predicted and actual scores for the vocational students
were .71 before and .75 after they took the test (both
p < .001); comparable results for college students in Study VII were .13
and .16, both nonsignificant, and for Study VIII they were
-.14 and -.12, also nonsignificant. The greater accuracy of
the vocational students is probably attributable to their fa- 
miliarity with the material they were tested on, compared 
to the unfamiliar content of the AP test for the two college 
samples. As expected, the relationships between predic- 
tion and performance were higher, though not significantly 
so, after students took the tests, when they knew what was 
covered. 
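
The prediction-accuracy figures above are correlations between predicted and actual scores. A minimal sketch of that computation, assuming an ordinary Pearson product-moment correlation (the report does not name the coefficient) and using hypothetical scores:

```python
from math import sqrt


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation between two equally long score lists.

    Raises ValueError for mismatched or too-short input, and
    ZeroDivisionError if either list has no variance.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    variation_x = sum((a - mean_x) ** 2 for a in x)
    variation_y = sum((b - mean_y) ** 2 for b in y)
    return covariation / sqrt(variation_x * variation_y)


# Hypothetical predicted and actual exam scores for five students.
predicted = [70, 85, 60, 90, 75]
actual = [65, 80, 70, 95, 72]
print(round(pearson_r(predicted, actual), 2))
```

A value near 1 would mean students who predicted higher scores also tended to earn them; values near 0, as reported for the college samples, mean the predictions carried little information about actual performance.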

In both of the studies involving college students, the 
analysis of academic performance based on actual word 
knowledge (number correct on the vocabulary test) ac- 
counted for more variance than did comparable analyses 
using the more conclusive KMA scores. It seems possible 
that students’ achievement in class is best predicted by ac- 
tual word knowledge rather than their estimates of it. 
Furthermore, in view of the nonsignificant relationships 
for the high school sample between either actual word 
knowledge or their KMA scores and final course exam re- 
sults, it seems likely that domain-specific knowledge may 
be most useful for predicting course performance. 

An important question to investigate is whether estimated
knowledge or demonstrated knowledge on tests in
the domain that is the subject of instruction and evaluation
is likely to account for more variance than similar
indices based on fairly general materials, such as those
used in the studies reviewed here. The prior research as- 
sumed that the word list, vocabulary test, and reading 
passage were similar to the kinds of material students 
would be exposed to in nontechnical areas of instruction. 
The studies relating knowledge monitoring ability to 
classroom learning found KMA relationships with 
achievement in English and humanities courses, but not in
the sciences and social sciences. These results suggest that
general knowledge, or metacognitive estimates of that 
knowledge, are less useful in more technical areas that rely 
on a domain-specific technical vocabulary than they are in 
subjects that have a more widely shared knowledge base 
and vocabulary. 

Metacognitive Knowledge 
Monitoring Ability and 
Mathematics 

All of the studies described so far used the KMA in the 
domain of declarative word knowledge and employed 
similar or identical versions of the study materials. There- 
fore, the question arises whether the procedure can be 
generalized to other domains, such as mathematics. As is 
true of vocabulary, mathematics is of special interest be- 
cause it is also important in classroom learning. However, 
computation and problem solving in mathematics involve 
more procedural knowledge than does learning vocabu- 
lary. Thus, one purpose of the two studies described 
below was to examine the applicability of the KMA pro- 
cedure to the domain of procedural knowledge in 
mathematics. 

The research described above involved relatively ma- 
ture students, predominantly those attending college; only 
two investigations studied high school students. A further 
question to be examined in the next two studies was 
whether the KMA was equally appropriate to examining 
learning among younger, elementary school students. 

Study IX. Mathematical Problem Solving among
Elementary School Students 6 

Van Haneghan and Baker (1989) reported on a 
number of investigations of the effects of metacognition 
on the accuracy of problem representation in mathe- 
matics. The results indicated that metacognition was as 
important for learning mathematics as it was for reading. 
These findings are supported by other researchers, such as 
Campione, Brown, and Connell (1989), Lester, Garofalo, 
and Kroll (1989), and Schoenfeld (1992). Additional research
(Cardelle-Elawar, 1992; Montague, 1992) has
shown that students’ performance in solving mathemat- 
ical problems was facilitated when they were taught a 
metacognitive approach. Therefore, in the studies below, 
it was expected that procedural KMA scores in mathe- 
matics should be related to overall achievement in that 
subject. 

Participants and Procedures 

A set of 30 mathematical questions was constructed
(20 computation and 10 problem-solving items); the items
were selected from the fifth-grade mathematics cur- 
riculum. Students were first asked to take six minutes to 
determine whether “you feel able to solve these problems. 
Do not solve them now,” giving them an average of 12 
seconds per problem. In a later session, the same 30 ques- 
tions were used, and students were given 40 minutes to 
actually solve the problems. A number of anxiety scales 
were also administered. 

A total of 51 fifth-grade students (31 female) from an 
urban public school served as subjects in this study. The 
students were predominantly of Hispanic origin and their 
reading and mathematical achievement ranged from av- 
erage for their grade to two years below grade level. 

Results and Discussion 

Scoring of the mathematics KMA was similar to the
procedure used for the word knowledge KMA scores reported
earlier and involved generating four scores.
Students felt that they could:

1) solve a problem and did so (+ +),

2) not solve a problem and did not (− −),

3) solve a problem, but did not (+ −), and

4) not solve a problem, but did (− +).
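
The four-way scoring listed above can be sketched as follows. The function names and the sample responses are hypothetical; each category is written as the sign of the estimate followed by the sign of the outcome, and the combined totals follow the report's descriptions of correct estimates (+ + and − −), incorrect estimates (+ − and − +), and number right (+ + and − +).

```python
from collections import Counter


def kma_category(estimated_can_solve: bool, solved: bool) -> str:
    """Category for one item: estimate sign followed by outcome sign."""
    return ("+" if estimated_can_solve else "-") + ("+" if solved else "-")


def kma_scores(estimates: list[bool], outcomes: list[bool]) -> dict[str, int]:
    """Count the four KMA categories and the combined totals for one student."""
    if len(estimates) != len(outcomes):
        raise ValueError("estimates and outcomes must cover the same items")
    counts = Counter(kma_category(e, o) for e, o in zip(estimates, outcomes))
    scores = {category: counts.get(category, 0) for category in ("++", "+-", "-+", "--")}
    scores["correct_estimates"] = scores["++"] + scores["--"]
    scores["incorrect_estimates"] = scores["+-"] + scores["-+"]
    scores["number_right"] = scores["++"] + scores["-+"]
    return scores


# Hypothetical responses for one student on six items.
estimates = [True, True, False, False, True, False]
outcomes = [True, False, False, True, True, False]
print(kma_scores(estimates, outcomes))
# {'++': 2, '+-': 1, '-+': 1, '--': 2, 'correct_estimates': 4,
#  'incorrect_estimates': 2, 'number_right': 3}
```

Keeping the number right separate from the estimate totals mirrors the report's comparison of KMA scores with raw test scores.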

The results dealing with anxiety will be discussed later in 
this report. 

There were no differences among students’ metacogni- 
tive estimates of ability attributable to gender, so these data 
were pooled for further analysis. The knowledge moni- 
toring scores were correlated with the total math score on 
the Metropolitan Achievement Test (1985) obtained from 
the students’ records. The correlations are displayed in 
Table 4. The last row in that table represents the number 
correct on the math test. The + + and − − scores were
combined to indicate students’ correct estimates of their
ability to solve mathematical problems, and the − + and
+ − scores were added to show the incorrect estimates.

Table 4

Correlations Between Knowledge Monitoring Scores and
Achievement in Mathematics*

Knowledge Monitoring Score      Correlation

+ +                              .73***
                                -.43**
                                -.65***
− −                             -.11
+ + and − −                      .76***
− + and + −                     -.72***

* The correlation between the Knowledge Monitoring raw score (i.e., total
number correct) and performance on the Metropolitan Achievement Test
in Math was .52**.

** p < .01
*** p < .001

Table 4 indicates that three of the four estimates were 
significantly related to students’ achievement in mathe- 
matics. The correlation between the number correct on 
the math test and the Metropolitan test score was .52. 
When that relationship is compared to the correlation of 
.73 between the Metropolitan test score and ++, or the 
correlation of .76 between the Metropolitan test score 
and total number of correct estimates, it is clear that 
metacognitive estimates of ability to answer the questions 
are more substantially related to mathematical achieve- 
ment than is the number of problems solved correctly, 
irrespective of estimate. That finding was confirmed by 
regression analysis. When the number of correct esti- 
mates, incorrect estimates, and total number right were 
used in the regression, only the correct estimates con- 
tributed significantly to prediction of the Metropolitan 
test score [R² change = .08, F(3,45) = 8.52, p < .01]. 
These results confirm the basic assumption that students’ 
metacognitive estimates of their ability contribute signifi- 
cant independent variance beyond that accounted for by 
the number correct on a test. 
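
For readers unfamiliar with tests of an R² increment, the textbook F statistic for added predictors can be sketched as follows. This is an illustration of the general formula only; the report does not describe how its regression was computed, so the function name `r2_change_f`, its parameters, and the numbers in the usage line are assumptions and are not intended to reproduce the values reported above.

```python
def r2_change_f(r2_full, r2_reduced, k_added, n, p_full):
    """F statistic for the R-squared increment when predictors are added.

    F = ((r2_full - r2_reduced) / k_added) / ((1 - r2_full) / (n - p_full - 1))
    with (k_added, n - p_full - 1) degrees of freedom.

    r2_full:    R-squared of the model containing all predictors.
    r2_reduced: R-squared of the model without the added predictors.
    k_added:    number of predictors added.
    n:          sample size.
    p_full:     number of predictors in the full model.
    """
    df_error = n - p_full - 1
    if k_added <= 0 or df_error <= 0:
        raise ValueError("degrees of freedom must be positive")
    f_value = ((r2_full - r2_reduced) / k_added) / ((1.0 - r2_full) / df_error)
    return f_value, (k_added, df_error)


# Hypothetical values chosen for illustration, not taken from the study.
f_value, dfs = r2_change_f(r2_full=0.5, r2_reduced=0.4, k_added=1, n=50, p_full=3)
```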

The results support predictions regarding the relation- 
ships between the procedural KMA in mathematics and 
achievement in that domain. As expected, there were sub- 
stantial correlations between students’ estimates of their 
ability to solve mathematical problems and their achieve- 
ment in mathematics. Also as expected, inaccurate 
estimates were negatively related to achievement. While 
no causal inferences about the relationship between math- 
ematical achievement and the ability to monitor 
knowledge can be made from these correlational data, the 
fact that the variables co-vary as expected supports the 
generalizability of the KMA procedure and suggests that 
the technique is useful for further research on achievement 
in mathematics. 

Study X. Relationship of Knowledge Monitoring 
Ability in Mathematics to Age and Achievement 7 

The prior study provided encouraging evidence of the 
applicability of knowledge monitoring ability to achieve- 
ment in mathematics. Furthermore, the results of Study IX 
also indicated that the KMA could be used in assessing el- 
ementary school students. Because metacognition is often 
viewed as a developed ability and assumed to increase 
with age, one purpose of this study was to investigate 
whether procedural knowledge monitoring ability in 
mathematics would also increase with age. The preceding 
study indicated a high relationship between KMA scores 
in mathematics and achievement test scores in that do- 
main. Study X examined whether knowledge monitoring 
scores were related to teachers’ judgments of mathemat- 
ical ability. 


Participants and Procedures 

Students ( N = 164, 70 female) were selected from the 
fourth, fifth, and sixth grades of a school attended largely 
by minority students. Mathematical ability was determined 
by teachers’ judgments; 29 students were placed in the low-, 
93 in the medium-, and 42 in the high-ability groups. 

Students were presented with 15 mathematical word 
problems involving addition and subtraction. The prob- 
lems were set in the context of an ice cream store and 
students received a menu of prices for different products 
that were referred to in the problems. The materials were 
prepared in two versions presumed to elicit varying levels 
of interest among students. The results dealing with in- 
terest will be discussed later in this report. The materials 
were administered on two days during regular class pe- 
riods. On the first day, students examined the problems 
and estimated whether they could solve them or not; on the 
second day, the students were asked to solve the problems. 

Results and Discussion 

Students’ responses were assigned a score of 1 for each 
correct estimate (combining the ++ and -- scores) and 
0 for each incorrect estimate (combining the + - and - + 
scores). Due to a computer malfunction, raw data were 
not available for rescoring in the format used in the other 
studies. The data were then submitted to a 3 (grades) x 2 
(gender) x 2 (group: control versus interest, see below) x 
3 (math ability) analysis of variance. 

As expected, there was a significant increase in knowl- 
edge monitoring scores from grades four to six ( F = 34.66, 
df = 2, 144, p < .001, eta² = .26; see Figure 7 for a plot of 
the data). Also as expected, knowledge monitoring scores 
increased with mathematical ability (F = 15.25, df = 2, 144, 



Figure 7. Relationships between mathematical knowledge monitoring 
scores and grade level. 
p < .001, eta² = .18; see Figure 8 for a plot of the data). 
These results offer further support for the construct validity 
of the KMA procedure in that older or more capable stu- 
dents were expected to have higher knowledge monitoring 
ability than their younger, less capable counterparts. There 
were no significant differences attributable to gender. 



Figure 8. Relationships between mathematical knowledge monitoring 
scores and math achievement group. 

Summary: Knowledge Monitoring Ability and 
Achievement in Mathematics 

The results of Studies IX and X were quite positive re- 
garding the applicability of the KMA to mathematics. The 
relationship of knowledge monitoring scores to achieve- 
ment in mathematics in Study X is similar to the 
correlations with math achievement test scores reported in 
Study IX and both indicate strong relationships between 
metacognitive knowledge monitoring ability and achieve- 
ment in mathematics. The increases in ability associated 
with age reported in Study X also support that relation- 
ship. Furthermore, because most of the items used in both 
mathematics studies involved procedural knowledge of 
the type needed to solve word problems, the results sug- 
gest that the KMA may be applicable to procedural 
knowledge as well as declarative word knowledge. 

Metacognitive Knowledge 
Monitoring Ability and Affect 

The paradigm shift to a cognitive orientation in psy- 
chology generated a great deal of research intended to 
clarify the cognitive processes controlling learning. How- 
ever, the impact of affective processes on learning has 
received considerably less attention (Tobias, 1992, 1994a, 
b). The research discussed in this section was intended to 
forge a link between affect and cognition by examining 
the influence of affective variables such as anxiety and in- 
terest on metacognitive knowledge monitoring ability. 

The Impact of Anxiety on Knowledge 
Monitoring Ability 

One affective variable that has been the subject of a 
great deal of research is anxiety and its impact on 
learning. In general, that research has suggested a negative 
relationship between different forms of anxiety and 
achievement (Tobias, 1992; Hembree, 1988). It has been 
suggested (Tobias, 1985, 1992) that anxiety reduces the 
cognitive capacity available for task solution. The ca- 
pacity required by an executive process such as 
metacognitive knowledge monitoring was expected to be 
especially reduced among highly anxious students. There- 
fore, a negative relationship between anxiety and 
knowledge monitoring ability was anticipated because 
“highly test anxious students can be expected to have less 
adequate metacognitive abilities than those with lower 
anxiety” (Tobias, 1992, p. 28). 

Knowledge Monitoring Ability, Reading 
Comprehension, and Test Anxiety 

Study II also examined the relationship of the KMA 
procedure to anxiety. The worry subscale of the Test Anxiety 
Inventory (Spielberger et al., 1980) was administered 
to the subjects. 

As expected, the more highly anxious participants per- 
formed less well on the KMA. Those with less anxiety 
achieved a significantly higher number of “hits” than 
those prone to higher levels of anxiety [t(115) = 4.92, p < 
.001], and in general the less anxious subjects had higher 
levels of metacognitive word knowledge as measured by 
d’, [t(115) = 4.07, p < .001], confirming the expected neg- 
ative relationships between knowledge monitoring ability 
and test anxiety. 
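
The d’ index mentioned above comes from signal detection theory, where it is defined as the difference between the z-transformed hit rate and false-alarm rate. The sketch below is a minimal illustration, not the computation used in Study II. It assumes that a "hit" is a word the student claimed to know and passed on the test (+ +), a "miss" is - +, a "false alarm" is + -, and a "correct rejection" is - -. The function name `d_prime` and the log-linear correction are also assumptions of this sketch.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    A log-linear correction (add 0.5 to each count and 1 to each total)
    keeps both rates strictly between 0 and 1, where the inverse normal
    transform is defined. The correction is an assumption of this sketch.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


# Hypothetical counts (illustration only, not study data).
no_discrimination = d_prime(5, 5, 5, 5)    # estimates unrelated to outcomes
good_discrimination = d_prime(9, 1, 1, 9)  # estimates mostly accurate
```

Under these assumptions, a student whose estimates are unrelated to test outcomes obtains a d’ of zero, and larger positive values indicate more accurate knowledge monitoring.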

Knowledge Monitoring Ability in Mathematics 
and Anxiety 

Study II found the expected negative relationship be- 
tween knowledge monitoring scores and anxiety with 
respect to vocabulary. Study IX, in addition to investi- 
gating the extension of the KMA procedure to 
mathematics, also studied its relationship to both test and 
mathematics anxiety. 

As part of Study IX, the Fennema-Sherman (1976) scales 
assessing math anxiety and attitudes toward mathematics 
were administered to the participants (see the earlier description 
of Study IX) in a first session. To ensure that the 
subjects could understand the questions, each item was 
read aloud while the students read the materials to them- 
selves. The Worry-Emotionality Scale (Morris, Davis, & 
Hutchings, 1981), a 10-item, Likert-type measure of these 
components of test anxiety, was also administered. Stu- 
dents’ achievement in mathematics was determined from 
their scores on the Metropolitan Achievement Test (1985) 
obtained from school files. 

In Study IX no gender differences in the effects of anx- 
iety were found, so the data for all students were pooled. 
The relationships between knowledge monitoring ability 
and mathematics anxiety, as well as with worry and emo- 
tionality, are shown in Table 5. 

Table 5 

Correlations Between Knowledge Monitoring Scores and 
Anxiety in Mathematics 

Score             Math Anxiety     Worry and Emotionality 

+ +                  -.42**             -.22 
+ -                   .32*               .25 
- +                   .38**              .23 
- -                   .00                .20 
+ + and - -          -.46**             -.15 
- + and + -           .46**             -.33* 

* p < .05 
** p < .01 


Table 5 indicates that, as expected, mathematics anxiety 
was negatively related to correct estimates of knowledge 
and positively related to incorrect ones. The negative 
relationships between knowledge monitoring ability 
and anxiety are generally similar to those found in Study 
II, confirming expectations that anxious students have less 
ability to monitor their knowledge than their less anxious 
peers. 

Knowledge Monitoring Ability and 
Anxiety among High School 
Dropouts and Continuing Students 

Study VI investigated whether continuing students 
and high school dropouts differed in knowledge 
monitoring ability. An additional purpose of that study 
was to examine the differences in anxiety level between 
continuing students and high school dropouts, as well as 
the relationship between anxiety and metacognitive 
knowledge monitoring. In this study, the Test Anxiety 
Inventory (Spielberger et al., 1980) was given to all participants, 
followed by two administrations of the Worry-Emotionality 
Scale (Morris et al., 1981). Initially, participants 
were asked to complete the Worry-Emotionality 
Scale in terms of how they felt while being tested in 
general; when the scale was readministered after the vo- 
cabulary test, students were asked to complete the scale 
in terms of how they felt while completing the vocabu- 
lary test. 

Surprisingly, the results of a MANOVA indicated that 
there were no differences in anxiety level between high 
school dropouts and continuing students on any of the 
seven anxiety scores (the three Test Anxiety Inventory 
scores: Worry, Emotionality, and Total, in addition to 
four Worry and Emotionality scores from each adminis- 
tration of those scales). That finding is puzzling in view of 
the reports in the literature that poor performance in 
school, and presumably on tests, is a major reason stu- 
dents drop out of high school. One explanation may lie in 
the problems to which self-report measures in general, 
and self-reports of test anxiety in particular, are subject. 
Students can easily minimize or deny indications of test 
anxiety when responding to these measures and present 
themselves as not caring about how well they might func- 
tion on tests. The KMA procedure, however, made it 
difficult for students to present themselves in a more fa- 
vorable light, and that may account for the findings of 
group differences in metacognitive knowledge monitoring 
ability and the absence of differences on measures of test 
anxiety. 

Most of the zero-order correlations between the KMA 
scores and the anxiety indices were negative, and a fair 
number were significant. Multiple linear regression 
analyses were computed with the KMA scores as the de- 
pendent variable and the anxiety scores as the 
independent variable. Results indicated that the anxiety 
scales had a significant impact only on the ++ scores [R² 
= .25, F(7,72) = 3.43, p < .01]; significant beta weights 
were found for Emotionality on the Worry-Emotionality 
Scale taken after students had completed the vocabulary 
test (t = 2.74). The regression analysis also indicated that 
none of the other KMA scores was significantly related to 
the anxiety scales. In view of the number of anxiety and 
knowledge monitoring scores, the finding of significant re- 
lationships for some of them is not surprising. In general, 
however, the results of this study suggested that there was 
little association between metacognitive knowledge moni- 
toring ability and anxiety. 

Knowledge Monitoring Ability and 
Anxiety among Vocational High School Students 

Study V, which examined knowledge monitoring 
ability among vocational high school students, also inves- 
tigated the relationship between anxiety and knowledge 
monitoring ability, as well as between anxiety and 
achievement. In addition to relating metacognition to anxiety, 
it was expected that students with lower GPAs would 
have higher levels of anxiety than those with higher GPAs. 

The anxiety scales and the order in which they were 
administered in Study V were identical to those used in the 
study of high school dropouts (Study VI). The Worry- 
Emotionality Scale (Morris et al., 1981) was administered 
first and students were asked to respond in terms of how 
they felt while taking tests in general. The Test Anxiety In- 
ventory (Spielberger et al., 1980) was then given, followed 
by a second administration of the Worry-Emotionality 
Scale with instructions for students to respond in terms of 
how they felt while taking the vocabulary test. 

The significance of the differences in anxiety scores be- 
tween the participants in Study V above and below the 
median GPA was examined by performing a MANOVA. 
Surprisingly, there were no differences between the two 
GPA groups on any of the seven anxiety scores. Also, 
much as in Study VI, most of the zero-order correlations 
between knowledge monitoring ability and level of anx- 
iety were negative. Multiple linear regression analyses 
were then computed with the knowledge monitoring 
scores as the independent variable and the anxiety scores 
as the dependent variable. None of the regression equa- 
tions was significant for this sample. 

Summary: Knowledge Monitoring Ability and 
Anxiety 

The evidence regarding the relationship between level 
of anxiety and knowledge monitoring ability is mixed. 
Significant negative relationships were expected and 
found in two of the studies, one in mathematics and the 
other using vocabulary materials. On the other hand, two 
additional studies failed to find any evidence of differ- 
ences. There was a larger sample in Study II, which found 
significant negative relationships with level of anxiety 
using the vocabulary materials, than in the studies involving 
vocational high school (Study V) or high school 
dropout (Study VI) groups. Because many of the test anx- 
iety-metacognitive knowledge monitoring relationships in 
the latter two studies were, as expected, in the negative di- 
rection, and because some of the regression analyses 
approached significance, further research with larger sam- 
ples is needed to clarify the relationship between anxiety 
and knowledge monitoring ability. The results of Study II 
suggest that knowledge monitoring ability and level of test 
anxiety each contributed to performance on less chal- 
lenging reading material. On more demanding material, 
however, test anxiety and knowledge monitoring ability 
appeared to interact to affect performance. The highly 
anxious examinee, regardless of metacognitive ability, 
performed less well on the more demanding reading tasks, 
suggesting that worrying can interfere with strategic use of 
metacognitive skills when tasks are cognitively demanding. 
This finding is in accord with the 
anxiety-cognitive capacity model (Tobias, 1992) in that 
more demanding tasks require greater cognitive capacity 
that may not be available because of the resources ab- 
sorbed by anxiety. Further research is required to pursue 
that intriguing finding. 

In Studies V and VI, the failure of a number of anxiety 
indices to differentiate between either high school 
dropouts and continuing students, or between students 
above and below the median in GPA, was surprising. A 
meta-analysis of 562 studies dealing with test anxiety 
(Hembree, 1988) indicated that lower-achieving students 
experienced more test anxiety than did their more capable 
counterparts. While there had been no prior research 
specifically relating test anxiety to dropping out of high 
school, the bulk of the literature has indicated that stu- 
dents’ concern about their academic achievement was a 
major factor in dropping out of school, clearly suggesting 
that differences in test anxiety could be expected. As men- 
tioned above, the fact that the studies dealing with 
dropouts and vocational high school students both found 
significant differences in knowledge monitoring ability but 
neither found differences on a group of seven test-anxiety 
scales reemphasizes some of the problems with self-report 
measures, described at the beginning of this report. 

While the nonsignificant results for anxiety in Studies V 
and VI may be attributable to the small samples, or to 
other unknown factors, it should also be noted that the ten- 
dency of participants to present themselves in a more 
positive light may well have contributed to the nonsignifi- 
cant findings. One advantage of the KMA is that, because 
students do not report on either their feelings or their cog- 
nitive processes, it is difficult for them to present 
themselves more favorably. Of course, students could 
easily claim to know more words or to be able to solve 
more problems than is actually the case. However, that 
claim will be immediately challenged by administration of 
the test, making it harder for students to appear in a more 
positive light. 

Knowledge Monitoring Ability and Interest 

A good deal of recent research has addressed the effects 
of interest on learning for a variety of reasons (Renninger, 
Hidi, & Krapp, 1992). Clarification of the effects of interest 
adds to an understanding of the impact of intrinsic motiva- 
tion on learning. Interests also appear to be stable and 
long-lasting among adults (Hidi, 1990; Schiefele, 1991), 
suggesting that instruction adapted to students’ interests 
may have positive motivational effects over long periods of 
time. In addition, interests are ubiquitous in that everyone 
is interested in something. Also, findings of surprisingly 
variable and ineffective cognitive processing (Paris, 1988; 
Tobias, 1989) suggested that students’ interests or motivation 
may not have been engaged by the materials used in 
the studies. Finally, research on interest provides a useful 
and educationally relevant avenue for studying the rela- 
tionship between affect and cognition (Tobias, 1989, 
1994a, b), which is needed to obtain a more complete pic- 
ture of how individuals function on a day-to-day basis. 

Research has indicated that reading comprehension and 
recall are facilitated when students work on material related 
to their interests (Renninger et al., 1992). Furthermore, 
Schiefele (1990, 1991, 1992a, b) found that comprehension 
of interesting text was “deeper” (i.e., more likely to be 
propositional than verbatim). Little is known, however, 
about the cognitive processes that mediate the effect of in- 
terest on comprehension and recall. Therefore, it was 
recommended (Renninger et al., 1992; Tobias, 1994a) that 
research concentrate on the identification of the processes 
invoked by interest that facilitate learning. The studies reported 
in this section examined whether interest improved 
students’ metacognitive knowledge monitoring ability. 

Knowledge Monitoring Ability, Situational 
Interest, and Topical Interest 8 

Two types of interest, situational and topical, have 
been distinguished (Renninger et al., 1992). Situational interest 
is elicited by aspects of a situation, such as its 
novelty or intensity, and by the presence of factors con- 
tributing to the attractiveness of different types of content. 
Topical interest refers to individuals’ relatively enduring 
preferences for certain topics, tasks, or contexts and how 
these influence learning. The effects of both types of in- 
terest on knowledge monitoring ability were investigated 
in this study. It was expected that material of greater 
topical interest to students and text that elicited situational 
interest would generate more accurate knowledge monitoring. 
Furthermore, because interest was found to lead to 
deeper text processing (Schiefele, 1990, 1991, 1992a, b), 
it was expected that students would make more accurate 
knowledge monitoring estimates on words requiring in- 
tense processing if the material were interesting to them. 
In addition, because meanings of implicitly defined words 
have to be inferred, whereas those defined explicitly 
merely require recall of the definitions, it was reasoned 
that the meanings of implicitly defined words should be 
estimated more accurately when content was interesting 
to subjects. 

Study III Revisited 

It will be recalled that there were two groups of stu- 
dents in Study III, nursing students and college freshmen. 
Because the reading passage dealt with heart disease, it 
was expected that nursing students would have a greater 
topical interest in that material than would freshmen. 
Situational interest was varied by converting the expository 
passage to a narrative format. The narrative passage con- 
tained story attributes, such as character identification 
and life themes, which, according to Hidi and Anderson 
(1992), should increase situational interest. A principal 
character was introduced in the narrative version, which 
then described his efforts to learn more about coronary 
disease because his father had developed a mild form of 
that illness. The passage indicated that he was trying to 
help his father prevent more serious coronary problems. 
This structure made it possible to include all the factual in- 
formation presented in the expository version of the 
passage. Of the 139 students in the study, 84 completed 
all the materials during the two sessions. Complete data 
were available for 33 nursing students and 51 freshmen. 

In Study III, an analysis of variance was performed on 
the correct metacognitive estimates (combining + + and - - 
scores), with the correct estimates on explicitly and 
implicitly defined words, the dependent variables, 
treated as a repeated measure. In view of the importance 
of controlling for differences in prior knowledge (Tobias, 
1994), students’ scores on the first administration of the 
vocabulary test were used as a covariate because the 
nursing students were more familiar with the heart disease 
material (prescore mean = 27.4, S.D. = 4.0) than the 
freshmen (prescore mean = 20.1, S.D. = 5.3). Because 
there was an unequal number of females in the groups (24 
of 51 freshmen and 28 of 33 nursing students), gender 
was added as a factor. Thus, the ANOVA consisted of a 
full 2 (freshmen vs. nursing students) x 2 (expository vs. 
narrative passages) x 2 (gender) factorial design, with 
prescore as a covariate. Again, the two-level repeated 
measure consisted of the number of correct estimates on 
explicitly and implicitly defined words after subjects read 
the passage. The main effect of the repeated measure was 
assessed in the “deviation” manner (Delaney and 
Maxwell, 1981). 

The ANOVA results indicated that there was a signif- 
icant overall difference between the freshmen and nursing 
students [F(1,75) = 4.99, p < .05] favoring the nursing 
students. In addition, the mean number of correct esti- 
mates was higher for explicitly than for implicitly defined 
words [F(1,75) = 8.27, p < .01]. None of the other main 
effects or interactions was significant. The covariate, the 
number correct on the first administration of the vocab- 
ulary test, exerted a significant effect on the dependent 
measures [F(1,75) = 17.01, p < .001]. The adjusted means 
for freshmen on correct estimates for explicitly and im- 
plicitly defined words were 13.7 and 12.5, respectively, 
and for nursing students the corresponding means were 
15.0 and 14.1. 
These results support the general hypothesis that 
topical interest enhances metacognitive knowledge mon- 
itoring ability. As anticipated, nursing students, for 
whom the heart disease passage was more interesting, 
made more accurate metacognitive estimates of their vo- 
cabulary knowledge than did the freshmen, even when 
differences in prior knowledge of the vocabulary were 
controlled for statistically. The expected differences at- 
tributable to situational interest were not found because 
the KMA scores for the narrative and expository pas- 
sages were similar. Finally, contrary to expectations for 
both nursing students and freshmen, explicitly defined 
words were estimated more accurately than those that 
were implicitly defined. 

The absence of differences in knowledge monitoring 
ability due to situational interest may be a function of the 
similarities between the expository and narrative texts. 
Even though the passage was altered to create differences 
in situational interest, ratings of interest on a Likert-type 
scale, in the original study and on a follow-up, failed to 
show any differences between the passages. Perhaps 
greater differences in content are needed to produce dif- 
ferences in situational interest. 

Knowledge Monitoring Ability and Interest in 
Mathematics among Elementary School 
Students 

Study X found that metacognitive knowledge moni- 
toring ability in mathematics increased with grade and 
mathematical ability. A further purpose of that study 
was to examine the impact of personalizing instruction 
on metacognition. Research (Anand & Ross, 1987; 
Bracken, 1982; Herndon, 1987; Lopez, 1989, 1990; 
Ross & Anand, 1987; Wright & Wright, 1986) has 
shown that personalizing mathematical word problems 
by including materials such as the names of students, 
their friends, or teachers, or including materials related 
to students’ interests improved performance and atti- 
tudes toward the materials. It was, therefore, 
hypothesized that increased interest generated by per- 
sonalizing the word problems should improve students’ 
knowledge monitoring ability. 

Participants in Study X were randomly assigned to either 
interesting or control materials. In the “interesting” 
materials, the names of classmates and teachers were in- 
cluded in the math word problems, whereas the 
materials used for the control group contained standard 
names. In each set of materials, 15 mathematical word 
problems, set in the context of an ice cream store, were 
presented. Students received a menu of prices for dif- 
ferent products and were required to add and subtract 
prices of menu items. A 12-item Likert scale designed to 
assess interest in the materials was also administered. 

In this study, students’ responses were assigned a 
score of 1 for each correct estimate and 0 for each in- 
correct estimate of their knowledge. The data were then 
submitted to a 3 (grades) x 2 (gender) x 2 (interesting or 
control materials) x 3 (math ability) analysis of vari- 
ance. The findings dealing with knowledge monitoring 
ability, mathematical achievement, and grade level were 
reported previously. There were no significant differ- 
ences attributable to gender or to interest. However, 
there was an interaction between math achievement 
level, as determined by teachers’ evaluations, and interest 
(F = 6.02, df = 2, 144, p < .01, eta² = .05; see 
Figure 9 for a plot of the data). 



Figure 9. Relationships between mathematical knowledge monitoring 
scores, interesting or control materials group, and math achievement group. 


The interaction, unlike the main effect found in the 
previous interest study, suggests that personalization im- 
proved the performance of low-ability math students but 
had little effect on the two other groups. In view of the 
known difficulties students have with math word prob- 
lems (NAEP, 1979), it was thought to be important to 
make the materials interesting for both groups by cre- 
ating an ice cream parlor setting. It seems possible that 
setting the math word problems in this context may have 
made the materials more interesting for both groups, 
thus leading to the nonsignificant main effect for interest. 
There is evidence that this setting did arouse the interest 
of all students. There were no differences (F < 1.0) be- 
tween the high- and low-interest groups on the 12-item 
Likert scale administered after students completed the 
problems. Furthermore, there were no differences be- 
tween the high- and low-interest groups in the number of 
problems solved correctly. These findings indicate that 
even the low-interest group may have found the materials 
more attractive than the math word problems 
usually presented in class and suggest that an overall ef- 
fect might be found if there were greater differences in 
interest between the groups. 

Summary: Knowledge Monitoring Ability 
and Affect 

The findings of the anxiety and interest studies indicate 
that anxiety generally seems to have a negative effect on 
metacognitive knowledge monitoring ability and that 
working on interesting materials seems to facilitate it. Fur- 
ther research is needed to answer many questions before 
these tentative conclusions can be stated with greater con- 
fidence. It seems, however, that the KMA procedure is a 
useful tool for studying the effects of affect, and especially 
of interest, on metacognition. There are a number of per- 
suasive models specifying the cognitive processes 
mediating the impact of anxiety on learning (Sarason, 
1987; Eysenck, 1988; Tobias, 1992). However, little is 
known about the cognitive processes by which such “positive” 
affective variables as interest and motivation 
facilitate learning. The KMA procedure seems to be of use 
for further research relating metacognition to such posi- 
tive variables as interest or intrinsic motivation. 

Metacognitive Knowledge 
Monitoring Ability and Other 
Variables 

Two additional studies examined the relationship of 
the KMA procedure to the need for feedback and the 
KMA’s ability to differentiate between different types of 
students. These are summarized below. 

Study XI. Knowledge Monitoring Ability and 
Need for Feedback 9 

Feedback or reinforcement is one of the most widely 
studied variables in learning research. Numerous studies 
have demonstrated that feedback facilitates learning. 
McKeachie (1974) suggested that the effects of feedback 
or reinforcement on learning are not uniform but may 
vary with individuals and situations. Ashford and Cum- 
mings (1983) found that the importance of feedback 
varies with an individual’s uncertainty, and Tuckman and 
Sexton (1992) found that students in a no-feedback 
situation who held high expectations for their own per- 
formance outperformed those receiving feedback, whereas 
the reverse was true for students of middle and low self-perceived 
ability. These results clearly support the idea 
that there are individual differences in the need for feed- 
back. 

It was expected that the need for feedback would de- 
pend on students’ metacognitive ability to monitor their 
knowledge-gathering activities. In an analysis similar to 
that proposed by Butler and Winne (1995), it was pro- 
posed that students with accurate knowledge monitoring 
ability probably rely more frequently on their own in- 
ternal feedback regarding the accuracy of their responses 
than do their less-accurate peers. Such students are likely 
to have learned from experience that external feedback 
often duplicates the information supplied internally and 
they therefore should require less externally supplied feed- 
back than do peers with less accurate knowledge 
monitoring ability. Therefore, when students had a choice 
of whether to obtain feedback or not, a negative relation- 
ship between KMA scores and amount of feedback was 
expected. 

Participants and Procedures 

A sample of 59 fifth-grade students (35 female) from a 
predominantly minority school participated in this study. 
A list of 25 words appropriate for fifth-grade students and 
a vocabulary test based on the same words were devel- 
oped. Participants were also given a reading test consisting 
of 11 narrative stories with an average length of 140 
words or 15 sentences. Each story had a blank to be filled 
in, and students were instructed to select a word for each 
blank from four choices appearing in the right margin. 
The words on the word list and reading test were dif- 
ferent. Participants were told that the correct answer to 
each question was printed in the left margin of each page, 
covered by a tab, and that they could look at the answers 
whenever they wished to by simply lifting the tab. Partic- 
ipants were tested individually, and the number of times 
the tabs were lifted to check the correct answer was 
recorded. 

Results and Discussion 

Students’ need for feedback was operationally defined as 
the number of times they lifted the tabs covering the correct 
answers. The KMA procedure was used to determine stu- 
dents’ accuracy in estimating their word knowledge, and 
the results were then correlated with amount of feedback 
sought. The results of that analysis are shown in Table 6. 

Table 6

Correlations Between Knowledge Monitoring Scores and
Need for Feedback

Knowledge Monitoring Score                      r

+ +                                           -.50**
+ -                                            .38*
- +                                            .56**
- -                                           -.13
+ + and - -                                   -.79**
- + and + -                                    .76**
Actual score (total correct
regardless of estimate)                       -.19
R                                              .84**
R²                                             .71

* p < .01
** p < .001

As expected, the results indicate that the amount of feedback
needed is substantially related to students’ ability to
accurately monitor their knowledge. Accuracy of knowledge
monitoring was substantially and negatively related to amount
of feedback (r = -.79, p < .001), whereas the number of
inaccurate estimates was positively related to it (r = .76,
p < .001). Equally interesting was the finding that vocabulary
knowledge, determined by the number correct on the vocabulary
test, was not significantly related to amount of feedback
(r = -.19). The findings suggest that, as expected, students’
need for feedback is strongly related to their ability to
accurately monitor their knowledge. Furthermore, students’
estimates of their knowledge were clearly the major contributor
to that relationship, given that actual knowledge was unrelated
to amount of feedback.
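The contrast between estimates and actual knowledge is easier to see when each correlation is converted to the proportion of variance it accounts for. The following is a minimal Python sketch added for illustration, not part of the original analysis; the dictionary keys are illustrative labels, and the coefficients are the ones reported for Table 6.

```python
# Correlations with amount of feedback sought, as reported for Table 6.
# The key names are illustrative labels, not terms from the original study.
correlations = {
    "accurate estimates (+ + and - -)": -0.79,
    "inaccurate estimates (- + and + -)": 0.76,
    "actual score (total correct)": -0.19,
}

# Squaring a correlation gives the proportion of variance shared
# by the two variables, regardless of the sign of the relationship.
shared_variance = {label: round(r * r, 2) for label, r in correlations.items()}

for label, value in shared_variance.items():
    print(f"{label}: r^2 = {value:.2f}")
```

Under this conversion, accurate estimates share roughly 62 percent of their variance with feedback seeking, while the actual score shares about 4 percent.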

An equally important aspect of this study and its results 
was the fact that a new word list and vocabulary test were 
developed, different from the materials used in any of the 
other studies described in this report. Therefore, the find- 
ings also indicated that the KMA procedure has some 
generality across different types of vocabulary materials. 
Furthermore, this was the first study using a declarative 
vocabulary KMA with elementary school students, and 
the results suggest that the procedure is as applicable to 
younger students as were the mathematical materials used 
in Studies IX and X. 

Study XII. Differences in Knowledge Monitoring 
Ability among Learning-Disabled and 
Hyperactive Students 10 

It has been shown (Brown & Campione, 1986; 
Swanson & Trahan, 1992) that students diagnosed as 
Learning Disabled (LD) have lower metacognitive moni- 
toring ability than do those without special needs. 
Students with Attention Deficit Hyperactivity Disorders 
(ADHD) have been succinctly described by Douglas, Barr, 
O’Neil, and Britton (1986) as having an inability to stop, 
look, listen, and think, which also has a negative effect on 
metacognition. A review of research dealing with ADHD 
(Westby & Cutler, 1994) indicates that such students tend 


to have less effective complex problem-solving strategies 
and organizational skills, that they use less efficient strate- 
gies on memory tasks, that they “demonstrated deficits on 
all measures of study behavior. They studied for less time, 
expended less effort, and used poorer strategies... students 
with ADHD have significant deficits in executive 
processes.” (Westby & Cutler, 1994, pp. 63-64.) These 
deficits clearly suggest that ADHD students have less ef- 
fective metacognitive ability. Therefore, students 
diagnosed as LD or ADHD should have less accurate 
knowledge monitoring ability than students not affected 
by these conditions. This study tested that hypothesis. 

Participants and Procedures 

A list of 35 words and a vocabulary test based on the 
same words were developed from the high school cur-
riculum. Participants (N = 90) were selected from the
ninth (N = 29) and tenth (N = 61) grades of a public high 
school in an urban area; there were 28 females and 62 
males. LD and ADHD groups (N = 30 each) were formed 
by selecting students diagnosed by a school-based support 
team consisting of a licensed educational evaluator, a 
school psychologist, and a social worker. Scores on the 
Degrees of Reading Power (DRP) (Touchstone, 1991) test 
placed these groups in the fifteenth percentile of the pop- 
ulation. A control student group (N = 30) was selected on 
the basis of demonstrating average reading ability on the 
DRP and having no history of special educational needs. 

Results and Discussion 

Three of the KMA scores (+ +, + -, and - -) were analyzed
using MANOVA (the fourth score, - +, could not be entered
due to linear dependencies), with gender and group as the
independent variables. A significant overall difference among
the groups was found [Wilks F(6,164) = 5.95, p < .001, effect
size = .179]. Univariate analyses indicated significant
differences between the groups on + + scores [F(2,84) = 16.02,
p < .001; control group mean = 28.4; LD mean = 22.2; and ADHD
mean = 23.0]. Univariate analyses also indicated another
difference on the - - score [F(2,84) = 5.32, p < .01; control
group mean = 1.5; LD mean = 3.6; and ADHD mean = 4.3]; students
in the control group had lower scores because they had fewer
incorrect answers. There were no differences attributable to
gender, and no interaction between gender and group was found.
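The linear dependency mentioned above presumably arises because the four category counts always sum to the number of words on the list. The sketch below illustrates that point in Python under the assumption that each of the 35 words receives both an estimate and a test answer; the three sets of counts are hypothetical and are not data from the study.

```python
N_ITEMS = 35  # length of the word list used in Study XII

# Hypothetical (+ +, + -, - +, - -) counts for three students.
students = [
    (28, 2, 3, 2),
    (22, 5, 4, 4),
    (23, 4, 4, 4),
]

def fourth_score(plus_plus, plus_minus, minus_minus, n_items=N_ITEMS):
    """Recover the - + count from the other three category counts."""
    return n_items - plus_plus - plus_minus - minus_minus

# The - + count is fully determined by the other three, so it carries
# no independent information for a multivariate analysis.
recovered = [fourth_score(pp, pm, mm) for pp, pm, _mp, mm in students]
print(recovered)
```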

A similar analysis of the number correct on the vocabulary
test (+ + and - +) also indicated significant group
differences [F(4,166) = 7.55, p < .001, effect size = .154].
Univariate analysis indicated that only the differences on
the + + scores were significant; the group means were the
same as for the preceding analysis. The results confirm
expectations regarding differences between regular, LD,
and ADHD students with respect to their ability to monitor
their knowledge and differentiate between what they
know and do not know in this domain. While the results
were similar when the dependent variable consisted only
of the number correct on the vocabulary test, the effect
was smaller (.154 compared to .179). As expected, the
control group of students without special needs received
KMA scores showing that they were better able to differentiate
between the + + and - - words than were
students in the other two groups.

There were large differences in reading ability between 
the groups, which may also have accounted for the group 
differences, irrespective of diagnostic category. It is often 
difficult to separate the effects of reading ability in re- 
search comparing LD, ADHD, and more traditional 
students because reading problems are one of the defining 
characteristics of the two former groups. Further research 
with similar groups may resolve this problem. In any 
event, these results provide additional support for the con- 
struct validity of the KMA procedure. In view of the fact 
that this study, like the prior one, also developed a new 
list of words and vocabulary test, the results also support 
the generality of the KMA procedure across different 
types of vocabulary materials. 

General Discussion 

The findings of the 12 studies summarized above sup- 
port the construct validity of the KMA procedure. 
Comparable results were found for samples from diverse 
student populations: elementary school students, those at- 
tending regular and vocational high schools (including 
students diagnosed as LD and ADHD), those who 
dropped out of high school, students in pre-freshman 
skills programs, and those who had attended college for 
some time. Furthermore, substantially similar results were 
obtained for procedural knowledge in mathematics, as 
well as for declarative vocabulary knowledge based on 
three different sets of vocabulary materials developed to 
be appropriate for students at elementary school through 
college levels. 

In view of the fact that the KMA may be administered 
to groups, as well as to individuals by computer, and is 
objectively scored, it seems to be a promising approach 
for the assessment of the knowledge monitoring compo-
nent of metacognition. In addition, Studies V and VI
indicated that the KMA made it less likely that students 
could present themselves in a more favorable light, one of 
the problems inherent in the social desirability aspect of 
self-report instruments. While no data comparing the 
KMA to other metacognitive scales have so far been col- 
lected, we expect that this measure of knowledge 
monitoring ability is likely to be more accurate than self- 
report scales because students are less able to present 


themselves in an artificially favorable way. It remains for 
further research to investigate this correlation. 

KMA relationships with external criteria were somewhat
variable. Relationships with standardized achievement tests
were substantial and significant. For example, in Study I,
correlations with a reading comprehension test were .67.
Relationships with achievement in mathematics were also
substantial in Study IX (r = .76); and in Study X, highly
significant effects were found for KMA differences in
students’ math achievement (eta² = .26) and for increases in
mathematical ability over these elementary school grades
(eta² = .18). Pintrich (in press) cites some of these findings
as being among the most positive relationships identified
between any metacognitive measure and external criteria.
Relationships with need for feedback (Study XI) were also
found to be substantial (r² = .62). Significant, though somewhat
more moderate, relationships were found in those studies 
in which the KMA differentiated between divergent 
groups such as regular students and dropouts (Study VI), 
or among LD, ADHD, and students without special needs 
(Study XII). Generally, the lowest, though frequently sig- 
nificant, relationships were found between KMA scores 
and college grades. Presumably, as indicated previously, 
the low reliability of such grades accounts for the modest 
effects. Further, differences between the effects of knowl- 
edge monitoring estimates and actual knowledge 
discussed below should also be considered. 

A number of issues raised by the results require further 
research. These include the following: Do multiple ad- 
ministrations of the KMA procedure strengthen its 
relationship with other variables? Which of the different 
scores are optimal indicators of knowledge monitoring 
ability? Do estimates of knowledge account for more vari- 
ance than actual knowledge? These questions are 
addressed below. 

The KMA Procedure and Dynamic Assessment 

Some of the studies described above administered the 
reading passage to only a portion of the sample, others did 
not use the passage at all, and still others administered a 
word list and vocabulary test before and after students 
read a passage from which the word meanings could be 
inferred. A question arises about the value of interjecting 
the reading passage between administrations of the word 
list and vocabulary test. Giving students a chance to im- 
prove their knowledge has some similarities to dynamic 
assessment approaches (see Carlson & Wiedl, 1992; 
Guthke, 1992; Lidz, 1992), in which students are given 
new learning opportunities before being tested. Dynamic 
assessment procedures usually also include as part of the 
assessment process some intervention in students’ at-
tempts to learn, observations of their reactions to the
intervention, and an evaluation of their responses to the
assistance. Reviews have suggested (Carlson & Wiedl, 
1992) that students’ attempts to verbalize their learning 
difficulties, and receiving elaborate feedback about their 
efforts, contribute heavily to the value of dynamic assess- 
ment. The KMA differs from dynamic assessment 
procedures because it does not include any of these addi- 
tional efforts to facilitate learning; students are merely 
given a second opportunity to learn the words by reading 
a passage, without any other assistance. 

The results of the present research indicate that the
opportunity to learn the meanings of some words from the
passage was important only in Study I, relating the
KMA procedure to reading comprehension, and seemed to
have little effect on studies of learning in college or
estimation of actual performance. The findings indicate that,
with the possible exception of relationships with reading
comprehension, use of the word list and vocabulary test alone
appears to be effective in estimating metacognitive knowledge
monitoring ability, whether the reading passage is
used or not.

The distinction between explicitly and implicitly de- 
fined words was expected to be useful only in those 
studies in which students read the passage. The results of 
those investigations indicated that there were few differ- 
ences between these two types of words. Since neither the 
use of the passage nor the distinction between the two 
types of words appeared to affect the results, it seems pru- 
dent to abandon both those approaches in future research. 

Implications for Teaching and Research 

The results indicated that use of the reading passage 
did not add much explanatory power to the KMA as an 
appraisal instrument. It may, nevertheless, be interesting 
to use the passage in future research to study the applica- 
bility of the KMA for research on teaching students how 
to monitor their own knowledge. If the word list and vo- 
cabulary test are used as pre-post measures, the passage 
could be interjected to help students learn the meanings of 
those words about which they had made incorrect esti- 
mates of knowledge. Different levels of instructional 
support (Tobias, 1989) could be used to help students 
learn the meanings of the words they had estimated in- 
correctly. 

Use of the reading passage makes it possible to imple- 
ment a teaching strategy featuring maximal prompting, in 
the form of very active instructional interventions at the 
beginning and fading that out until the passage alone is 
presented without any prompts. The interventions could 
include such procedures as urging students to provide de- 
finitions or synonyms for the words, asking them to 
rephrase the clauses containing the target words, asking 
questions about the words and cuing students that the 


target words are especially important and that they should 
pay special attention to them. Of course, research would 
have to determine whether the suggested interventions 
actually constitute a hierarchy ranging from maximal to 
minimal instructional support. It should also be noted that 
a number of passages, with associated word lists and 
vocabulary tests, may be needed to develop effective strate- 
gies for teaching students how to monitor their own 
knowledge. Once research has determined the usefulness 
of the procedures outlined above, they could become an 
important resource to help teachers at all levels improve 
the knowledge monitoring ability of their students. 

In addition to the possible value for teaching of the in- 
structional interventions described above, interjecting 
reading passages or content material could make the 
KMA more similar to dynamic types of assessment and to 
actual classroom learning. Research could then determine 
whether such interventions improve the relationship of 
the KMA procedure to classroom learning. Giving 
students new learning opportunities before adminis- 
tering, or re-administering, the KMA procedure is likely 
to be more complex in mathematics or science than for 
word knowledge. Dynamic assessment in these fields 
would probably require very active instructional inter- 
ventions before students show increases in knowledge, 
because few learners can master new material in science 
or mathematics merely by reading a passage and working 
on problems, or even when assisted by the types of inter- 
ventions suggested above. 

Optimal Indicators of the Latent Knowledge 
Monitoring Construct 

Metacognitive knowledge monitoring ability is a latent
construct inferred from the various scores generated by
the KMA procedure. Many of the preceding studies combined
the + + and - - scores to develop a measure of
knowledge monitoring ability. The combined score
seemed to have face validity as the most direct and most
theoretically interesting index of knowledge monitoring
ability. Furthermore, by including the - - scores, the
combined total seemed independent of students’ actual
knowledge, because the combined estimate included items
answered incorrectly. Scores based on the signal detection
paradigm were used in Study II, but seemed to add little
to the combination of + + and - - scores used in the
other studies. However, the findings of some of the
investigations, especially Studies VII and VIII, suggest that
differences between groups were obscured when the subscores
for different categories (+ +, + -, - +, and - - for
words defined explicitly or implicitly) were combined.

Ideally, the optimal KMA score should be determined
empirically, rather than on the basis of its face validity.
The four subscores, or eight if the explicit-implicit
distinction is used, should be submitted to procedures such as the
analysis of covariance matrices in order to determine
which score(s) are optimal indicators of the latent knowledge
monitoring construct. Further research is clearly
needed with larger samples (perhaps 200 to 300 students)
to obtain some stability for the results. The data should
then be analyzed with structural equation modeling techniques
or comparable procedures to identify empirically
the optimal scoring device for the latent knowledge
monitoring construct.

Actual Knowledge and Estimates of Knowledge 

Research has indicated that vocabulary test scores are
one of the most powerful predictors of classroom learning
(Breland, Jones, & Jenkins, 1994; Just & Carpenter,
1987). KMA scores combine both students’ estimates of
what they know and their actual knowledge. Thus the + +
score is a composite of both actual word knowledge,
determined by the raw score on the vocabulary test, and the
students’ correct estimates of that knowledge. Each of the
studies described above examined whether the KMA estimates
contributed independent information beyond that
accounted for by students’ actual word knowledge.
Operationally, this question was analyzed by comparing the
variance accounted for by correct estimates (+ + and - -
combined) with the variance accounted for using only the
number correct on the vocabulary test (+ + added to - +).
Table 7 summarizes these results for each of the studies.
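The two composites being compared can be made concrete with a short tally. The following Python sketch is an illustration under stated assumptions rather than the scoring program used in the studies: the function name `kma_scores`, the input format (one pair per word, giving the student's estimate and the test result), and the sample responses are all hypothetical.

```python
from collections import Counter

# Category labels keyed by (estimated_known, answered_correctly).
LABELS = {
    (True, True): "+ +",
    (True, False): "+ -",
    (False, True): "- +",
    (False, False): "- -",
}

def kma_scores(responses):
    """Tally the four KMA categories and the two composites compared in the text."""
    counts = Counter(LABELS[(bool(est), bool(ok))] for est, ok in responses)
    tally = {label: counts.get(label, 0) for label in LABELS.values()}
    # Correct estimates: the student's estimate agreed with the test result.
    tally["accurate_estimates"] = tally["+ +"] + tally["- -"]
    # Actual knowledge: number correct on the test, regardless of estimate.
    tally["raw_score"] = tally["+ +"] + tally["- +"]
    return tally

# Hypothetical responses for one student on a six-word list.
responses = [
    (True, True), (True, True), (True, False),
    (False, True), (False, False), (True, True),
]
scores = kma_scores(responses)
print(scores)
```

Both composites include the + + count, which is why the comparison turns on whether the - - count or the - + count is added to it.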

Table 7 indicates that in Studies V, VII, and VIII (four 
comparisons), actual knowledge alone, determined by 
raw score on the vocabulary test, accounted for more vari-
ance (ranging from 1 to 17 percent) than did the estimates.
Also, there seemed to be little difference between actual 
knowledge and estimates in Study III. When college stu- 
dents’ estimates of their knowledge in Introductory 
Psychology courses were related to their AP Psychology 
scores, the effect size for actual knowledge was 13 percent 
(Study VII) and 17 percent (Study VIII) greater than for 
estimates of knowledge. When relationships between in- 
dices of Introductory Psychology students’ in-class 
performance and KMA scores were analyzed (Study VIII), 
the effect size for knowledge alone was 6 percent greater. 

Table 7

Summary Comparing Knowledge Monitoring Scores
(KMA) and Raw Scores

Study

I     KMA accounted for 4% more variance than raw scores.

II    KMA accounted for 5% more variance than raw scores.

III   Correlations similar for KMA and raw scores.

IV    Combined GPA differentiated KMA scores, effect size = .07,
      raw scores NS (effect size = .03). English GPA differentiated
      KMA scores, effect size = .07, raw scores .06. Humanities GPA
      differentiated KMA scores, effect size = .09, raw scores .05.

V     Vocational high school low and high GPA groups differed on
      KMA, effect size = .14, and on raw scores, effect size = .16.
      Predicted score before and after taking test and actual final
      exam score = NS for KMA and raw scores.

VI    Difference between high school students and dropouts greater
      on KMA, effect size = .13, than on raw scores, effect size = .10.

VII   AP data and final grade related to KMA, effect size = .14, and
      raw scores, effect size = .27.

VIII  AP data and final grade related to KMA, effect size = .16, and
      raw scores, effect size = .33. Class test data related to KMA,
      effect size = .17, and raw scores, effect size = .23.

IX    KMA r² with Metropolitan score = .58, raw score = .27.

X     *

XI    Estimates r² with need for feedback = .62, raw score = .04 (NS).

XII   Differences between regular, LD, and ADHD students greater
      with KMA than raw scores, effect size = .18 compared to .15
      for raw scores.

NS = Nonsignificant.
* = Could not be determined.

It is not unusual for knowledge of vocabulary, even in
an unrelated domain, to be an important predictor of
students’ grades in college exams, such as the multiple-choice
test and the AP Examination administered in Studies VII
and VIII. Vocabulary scores based on words not directly
related to a particular course curriculum have been shown
to be powerful predictors of all types of classroom
learning (Breland, Jones, & Jenkins, 1994; Just & Carpenter,
1987). Thus, findings that such scores were highly
related to how much students learned in a psychology
course (determined by either the AP Exam or in-class
tests) were not surprising. Furthermore, because students
had little prior experience with the content of the AP
Examination, they had no basis for estimating their
performance on that test. In such instances, it is therefore
not unreasonable that actual knowledge may be more
important in determining students’ achievement than
estimates of that knowledge.

Estimates of knowledge accounted for more variance in
seven of the studies (nine comparisons, ranging in effect
size or r² from 1 percent to 58 percent, with a median of 4
percent more variance), compared with actual knowledge.
The largest differences occurred in the investigation of
need for feedback (Study XI), in which the raw score on
the vocabulary test accounted for a nonsignificant 4 per-
cent of the variance, while accurate knowledge monitoring
estimates accounted for a highly significant 62 percent of
the variance! Of course, that finding should be replicated 
with larger samples. Nevertheless, it seems reasonable that 
students’ need for feedback should relate more strongly to 
their estimates of their knowledge rather than to actual 
knowledge. 
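The range and median cited above can be retraced from the entries in Table 7. The Python sketch below is offered only as a check on that arithmetic; the dictionary labels are illustrative, the values are the effect sizes or r² figures listed in Table 7 expressed as whole percents, and Studies I and II enter through the differences the table states directly.

```python
from statistics import median

# Differences stated directly in Table 7, in percentage points of variance.
differences = {"I": 4, "II": 5}

# (KMA, raw score) effect sizes or r-squared values from Table 7,
# expressed as whole percents to avoid floating-point rounding.
reported_pairs = {
    "IV combined GPA": (7, 3),
    "IV English GPA": (7, 6),
    "IV humanities GPA": (9, 5),
    "VI": (13, 10),
    "IX": (58, 27),
    "XI": (62, 4),
    "XII": (18, 15),
}
for label, (kma, raw) in reported_pairs.items():
    differences[label] = kma - raw

values = sorted(differences.values())
summary = {
    "comparisons": len(values),
    "smallest": values[0],
    "largest": values[-1],
    "median": median(values),
}
print(summary)
```

The nine comparisons come from seven studies because Study IV contributes three of them.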

Another large difference between the contributions of
estimated and actual knowledge scores occurred in Study
IX, one of the math studies. Students’ estimates of the
number of problems they could solve accounted for 31 
percent more variance than did the number of problems 
actually solved. The findings of Study IX were replicated 
in Study X; unfortunately, a computer malfunction made 
it impossible to compare the estimated and actual scores 
in that investigation. While the math studies clearly need 
replication, the findings suggest that estimates of knowl- 
edge may be more powerful predictors of achievement in 
that domain than in word knowledge. 

One possible reason for the substantial effects in math- 
ematics compared to word knowledge may deal with 
domain similarity. That is, estimates of knowledge in 
math were made with respect to content that was highly 
similar to the types of problems encountered in actual 
math courses. As indicated previously, the vocabulary 
words used in many of the studies were not similar to the 
words presented in actual courses, perhaps leading to 
somewhat weaker effects. That interpretation is supported 
by findings in several of the investigations. In Study I, re- 
lating actual declarative word knowledge and estimates of 
that knowledge to reading comprehension, the strongest 
relationships were found for KMA scores after students 
had read the passage in which the vocabulary words were 
defined. That sequence was obviously very similar to the 
task students face in reading comprehension tests. In ad-
dition, it will be recalled that in Study III, social science
and science had the lowest relationships with KMA
scores, and that in Study IV, the effects for social science
and science were nonsignificant. Since the KMA materials
were developed to be quite general, they were probably 
dissimilar to the types of materials with which students 
are presented in these more technical subjects. These re- 
sults suggest that the KMA has stronger effects within a 
domain, rather than across domains. Schraw, Dunkle, 
Bendixen, and Roedel (1995) found that knowledge mon- 
itoring had both domain-specific and domain-general 
attributes. Further research is needed to clarify the do- 
main-specific and/or domain-general characteristics of the 
KMA procedure. 

Another possible explanation for the more positive re- 
sults in the studies involving mathematics relates to the 
perceived difficulty of the subject. Everson, Tobias, 
Hartman, and Gourgey (1993) found that students per- 
ceive mathematics to be the second most difficult subject, 
right after science. Conceivably, as suggested below, stu- 
dents’ estimates of their knowledge in more difficult 
subjects are less automatic and involve more reflection 
about their prior experiences than in less difficult subjects. 
Students’ confidence in and/or their anxiety about these 
subject areas may also affect their estimates. Further re- 
search is needed using materials drawn from mathematics, 
science, and other technical fields to study both this ques- 
tion and the issue of domain specificity. 


The KMA Procedure and Degree of Difficulty 

Little information about the difficulty of the various 
vocabulary and math materials was available prior to 
their use in any of the studies. This may well have con- 
tributed to some of the varying results. It seems reasonable 
that estimates of knowledge based on students’ thoughtful 
consideration of what they know and don’t know would 
be more substantially related to other variables than esti- 
mates made more or less automatically. Rapid answers 
made with little reflection are most likely when students 
respond to materials that are very easy for them. Wrong 
estimates based on such relatively automatic responses 
probably indicate careless errors, rather than a failure to 
seriously consider the estimate. More difficult materials 
may also evoke nonreflective responses, since students 
may feel that they neither know nor care about what the 
correct answers to such questions are. Items of moderate 
difficulty, about which students may have partial knowl- 
edge that can be extended by exerting some effort, would 
appear to be most likely to elicit well-considered responses 
that correctly reflect students’ knowledge monitoring 
ability. 

Item difficulty is also of importance in considering the
different KMA scores. In the studies described above, of
the four scores generated by the KMA procedure, the
greatest number of responses fell into the + + category. It
may be assumed that more difficult items would yield
more - - and - + responses, increasing the reliability of
these items and the likelihood that they could explain
more of the variance between students with high knowledge
monitoring ability and their less able peers.
Furthermore, having more items in the - - category reduces
the agreement between estimates and number
correct for two reasons: First, such responses represent
accurate estimates but no knowledge about the item, and
second, more - - items allow for a smaller percentage of
+ + items.

In future research, these expectations about the effects 
of varying degrees of item difficulty should be tested by 
using items with a previously determined range of diffi- 
culty. It could be hypothesized that the most useful 
metacognitive knowledge estimates are likely to be gener- 
ated from materials of moderate difficulty, and that more 
difficult items will increase the difference between the ac- 
curacy of knowledge monitoring estimates and the 
demonstrated knowledge in a domain. 
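One conventional way to obtain a previously determined range of difficulty is to pilot the items and compute the proportion of students answering each one correctly. The Python sketch below illustrates that idea only; the function names, the pilot answers, and the 0.3 to 0.7 band used to define moderate difficulty are all assumptions made for the example, not values from the studies reported here.

```python
def item_difficulty(answers):
    """Return the proportion correct for each item.

    answers: one list per student, with True for a correct answer.
    """
    n_students = len(answers)
    n_items = len(answers[0])
    return [
        sum(student[i] for student in answers) / n_students
        for i in range(n_items)
    ]

def moderate_items(p_values, low=0.3, high=0.7):
    """Return indices of items whose proportion correct lies in [low, high]."""
    return [i for i, p in enumerate(p_values) if low <= p <= high]

# Hypothetical pilot data: four students, four items.
answers = [
    [True, True, False, False],
    [True, False, True, False],
    [True, True, False, False],
    [True, False, False, False],
]
p_values = item_difficulty(answers)
selected = moderate_items(p_values)
print(p_values, selected)
```

In this illustration only the second item falls in the moderate band; the first is answered correctly by everyone and the last by no one.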

Relationship of the KMA to 
Metamemory Research 

The KMA procedure described in this report is similar 
to metamemory research on the feeling of knowing (FOK) 
and judgment of learning (JOL). FOK judgments “occur
during or after acquisition and are judgments about
whether a given currently nonrecallable item is known
and/or will be remembered on a subsequent retention
test.... Judgments of learning (JOL) occur during or after
acquisition and are predictions about future test perfor-
mance on currently recallable items” (Nelson & Narens,
1990, p. 130). In terms of this definition, for the studies
reviewed in this report, students’ estimates of their ability 
with respect to both the word list and the math problems 
were similar to JOLs. 

FOK research was originated by Hart (1965), who 
asked general information questions of students who, 
after failing to recall an item, had to make a judgment re- 
garding their FOK about that item. Finally, they were 
asked to select an answer from a set of distractors. The 
procedure has been extended to asking students to guess 
if they could recall words learned in a paired-associate 
task (Hart, 1967; Ryan, Petty, & Wentzlaff, 1982). 
Nelson, Gerler, and Narens (1984) also extended the
FOK research to students’ ability to relearn, and to tasks 
in which students were asked to identify perceptual 
stimuli. Reder and Ritter (1992) investigated whether students
chose to retrieve or to recalculate the answers to mathematical
problems; both the time taken to solve the problems 
and the accuracy of the processes were studied. A review of 
FOK research indicated that “a large number of studies 
confirmed that (students)... unable to retrieve a solicited 
item from memory can estimate with above-chance suc- 
cess whether they will be able to recall it in the future, 
produce it in response to clues, or identify it among dis- 
tractors.... The standard finding is that the predictive 
validity of FOK judgments is above-chance, though far 
from perfect” (Koriat, 1993, pp. 609-610). 

The FOK and JOL paradigms differ from the present 
research in a number of ways. First, FOK judgments are 
typically elicited following a failure to recall, rather than 
after every administration of material. Second, in FOK or 
JOL research usually no attempts are made to enable stu- 
dents to improve their knowledge, as was done in some of 
the studies reviewed in this report. Third, the purposes of 
metamemory research are to clarify the mechanisms ac- 
counting for FOK and JOL, rather than to use the results 
as measures of metacognitive knowledge monitoring 
ability to be related to variables of importance in class- 
room learning. 

Suggestions for Further Research 

A number of recommendations for further research 
have already been made; additional suggestions that do 
not pertain directly to the previous discussion are pre- 
sented here. The positive findings relating knowledge 
monitoring ability to need for feedback suggest that 
studies of similar variables relating the KMA procedure to 
processes relevant to classroom learning may be fruitful. 
For example, forgetting what has been learned in school 
may be related to knowledge monitoring ability. It could 
be inferred that students with high knowledge monitoring 
ability, by having a clear sense of what they know and do 
not know, may be able to retrieve more prior learning 
than those who have a less secure grasp of what they 
know and do not know and who, hence, may have greater 
difficulty retrieving prior learning. A pilot study of the 
knowledge monitoring-forgetting relationship provided 
substantial support for that reasoning and will soon be 
followed up. 

The relationship between knowledge monitoring 
ability and the effect of distractibility is another fruitful 
area for investigation. Even though a great deal of anec- 
dotal evidence suggests that students are readily distracted 
from their studies, it has been surprisingly difficult to di- 
vert students in investigations specifically designed for 
that purpose (Slater, 1968; Tobias, 1973). While part of 
the problem may be attributable to motivational phe- 
nomena, i.e., the interest level of both the primary and 
distracting materials seems to be important in determining 
whether students are successfully diverted from their 
studies (Tobias, 1973), students’ knowledge monitoring 
ability may also help to determine whether students are 
distracted. Students with an accurate grasp of their knowl- 
edge would be expected to find distractions less disruptive 
than those with a hazier notion of what they know and do 
not know. 

Research should also be conducted relating knowledge 
monitoring ability to depth of knowledge processing 
(Craik & Lockhart, 1972). Students should be able to 
distinguish between the known and unknown more ac- 
curately if the knowledge was processed at a deep rather 
than shallow level. Deeper processing should enhance 
students’ knowledge monitoring ability, and it could be 
predicted that students will make more accurate distinc- 
tions between the known and unknown on material that 
they are induced to process deeply, either by experi- 
mental manipulations or instructions, rather than at a 
shallow level. 

Learning in complex domains such as science and engi- 
neering, or making diagnoses in medicine or other fields, 
often requires that students bring substantial amounts of 
prior learning to bear in order to understand and acquire 
new knowledge and/or solve problems. Some prior 
learning may be recalled imperfectly, or may never have 
been completely mastered during initial acquisition. Stu- 
dents who can accurately distinguish between what they 
know and don’t know should be at an advantage while 
working in such domains, since they are more likely to re- 
view and try to relearn imperfectly mastered materials 
needed for particular tasks compared with those who are 
less accurate in estimating their own knowledge. 

Further research is also needed to determine the relationships
between the KMA procedure and self-report 
measures of metacognition, study skills, and self-regula- 
tion. These constructs have some similarity to the KMA 
procedure and positive relationships should be obtained. 
Finally, the relationship between knowledge monitoring 
ability and measures of intelligence should be investigated. 
Sternberg (1991) has suggested that metacognition should 
be a component of intelligence tests; presumably those 
who consider metacognition an executive process 
(Borkowski, Chan and Muthukrishna, in press) would 
also agree with that recommendation. Research findings 
(Schraw, in press) indicate that academically able students 
have higher knowledge monitoring ability than those less 
able. Therefore, positive relationships between the KMA 
procedure and measures of general intellectual ability may 
be expected. 

Endnotes 

1. Study I was presented at the annual convention of 
the American Psychological Association, in San 
Francisco, August 1991. That paper was coauthored 
by S. Tobias, H. Hartman, H. Everson, and A. 
Gourgey, see References. 

2. This study, by Howard Everson, Ivan Smodlaka, 
and Sigmund Tobias, was published in Anxiety, 
Stress, and Coping, 1994, see References. 

3. A paper based on Studies III and IV was presented 
at the annual meeting of the American Educational 
Research Association in San Francisco, April 1995. 

4. The data for this study were collected by Deno 
Charalambous. 

5. The data for this study were collected by Heather 
Gerriry. 

6. The data for this study were collected by Dhalma 
Rosado. This investigation was presented as part of 
a paper at the annual convention of the American 
Educational Research Association in San Francisco, 
April, 1995. 

7. The data for this study were collected by Audrey 
D’Agostino. The study was part of a paper presented 
at the annual convention of the American 
Educational Research Association in New Orleans, 
April, 1994. 

8. This study, conducted by Sigmund Tobias, was 
published in the Journal of Educational Psychology, 
1995, see References. 

9. The data for this study were collected by Nadia 
Seignon. 

10. The data for this study were collected by Julie 
Wilson. 